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ABSTRACT 

Strong gravitational lensing is a powerful technique for probing galaxy mass distributions and for 
measuring cosmological parameters. Lens systems with extended source-intensity distributions are 
particularly useful for this purpose since they provide additional constraints on the lens potential 
(mass distribution). We present a pixelated approach to modeling the lens potential and source- 
intensity distribution simultaneously. The method makes iterative and perturbative corrections to an 
initial potential model. For systems with sources of sufficient extent such that the separate lensed 
images are connected by intensity measurements, the accuracy in the reconstructed potential is solely 
limited by the quality of the data. We apply this potential reconstruction technique to deep Hubble 
Space Telescope observations of B1608-I-656, a four-image gravitational lens system formed by a pair of 
interacting lens galaxies. We present a comprehensive Bayesian analysis of the system that takes into 
account the extended source-intensity distribution, dust extinction, and the interacting lens galaxies. 
Our approach allows us to compare various models of the components of the lens system, which 
include the point-spread function (PSF), dust, lens galaxy light, source-intensity distribution, and lens 
potential. Using optimal combinations of the PSF, dust, and lens galaxy light models, we successfully 
reconstruct both the lens potential and the extended source-intensity distribution of B1608+656. 
The resulting reconstruction can be used as the basis of a measurement of the Hubble constant. 
As an illustration of the astrophysical applications of our method, we use our reconstruction of the 
gravitational potential to study the relative distribution of mass and light in the lensing galaxies. 
We find that the mass-to-light ratio for the primary lens galaxy is (2.0 ± O.2)/iM0Lg g within the 

Einstein radius (3.9/i~^ kpc), in agreement with what is found for noninteracting lens galaxies at the 
same scales. 

Subject headings: gravitational lensing: general — gravitational lensing: individual (B1608-I-656) 
— methods: data analysis — galaxies: elliptical and lenticular, cD — galaxies: 
structure 



1. INTRODUCTION 

Strong gravitational lens systems provide a tool for 
probing galaxy mass distributions (independent of 
their light profil es) and for measuring co smological 
param eters (e.g. iKochanek. Schneider, fc Wamb sganss 
l2006l and references therein). Lens systems with 
extended source-intensity distributions are of special 
interest because they provide additional constraints on 
the lens potential (and hence the surface mass density) 
due to surface brightness conservation. In this case. 
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simultaneous determination of the source-intensity dis- 
tribution and the lens potential is needed. To describe 
cither the source- intensity or the lens potential/mass 
distribution, there are two approaches in the literature: 
(1) "parametric," or better, "simply parameterized," 
using simple, physically motivated functional forms 
descr i bed by a few (^ 10) parameters (e.g., iK ochanefl 
TMI: iKneib et all [l99fit iKeetonI [2OOII : iMarshalll l200l 
Julio et al.l I2007D . and (2) pixel-based ("pixelated," 



or "free-form," or sometimes, inaccurately, "nonpara- 
metric") modeling on a grid, which has been done 
for b o th the source intens it y Ce.g.. IWa Uington ct aTl 



19961 IWarren & Dye 


120031: Treu & KooDmansI l200l 


Dve & Warred 20051: 


Koo 


pmansl 20051; Brewer & LewisI 


200(1 Suvu et all 


2006 


: iWavth & Websted 120061: 


Dve et al.l 2008') and 


the lens potential/mass distribu- 



tion (e.g ., WiUiams & Saha 2000; Bradac ct al. 200^ 
'KooDinans 2005; Saha et all 120061: ISuvu & BlandforJ 
i2006; .Jee et aL,2007: .Vegetti fc Koopmana 2008, ') . Most 
of the developed lens modeling methods are simply 
parameterized. In particular, for the measurement of 
the Hubble constant , lens potential/mass models prior 
to ISaha et al] (2006) have been simply parameterized 
because most of the strong lens systems with time delay 
measurements have only point sources (as opposed to 
extended sources) to constrain the lens potential/mass 
distribution. A precise measurement of the value of 
Hq is important for testing the flat A-cold dark matter 
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(CDM) model and studying dark energy. The cosmic 
microwave background (CMB) allows determination of 
cosmological param eters with high accu racy with the 
exception of Hq fe.g. iKomatsu et 311120081 ). An indepen- 
dent measurement of Hq to better than a few percent 
precision provides the single most useful co mplement to 
t he CMB for dark e nergy studies (lHull2005D . 

iKoopmani ()2005[ ) developed a method for pixelated 
source-intensity and lens potential reconstruction that 
is based on the potent ial correc tion s cheme proposed 
by iBlandford. Surpi. fc Kundi g (|2001f ). This pixe- 
lated potential reconstruction method is applicable to 
lens systems with extended source-intensity distribu- 
tions. Pixel-based modeling has the advantage over 
simply-parameterized modeling in the flexibility in the 
parametrization. This is especially important in complex 
lens systems (e.g. multicomponent source galaxies or 
multiple lens galaxies) where simply-parameterized mod- 
els may become inadequate. Furthermore, pixel-based 
modeling has the capabilities of detecting dar k mat- 
ter su bstructures Qioopmans 2005; Vegetti fc KoopmansI 
[20081) . 

In this paper, we prese nt a lens rn o deling technique 
that is similar to that of iKoopmaiii (|2005f ). but in a 
Bayesian framework to allow quantitative comparison be- 
tween various source intensity and lens potential models. 
The point-spread function (PSF), lens galaxy light, and 
dust models are also incorporated in this scheme. There- 
fore, this method provides a way to rank these data mod- 
els (with the five interdependent components: source- 
intensity distribution, lens potential, PSF, lens galaxy 
light and dust) quantitatively. There are also propa- 
gation effects due to structures along the line of sight 
(LOS), but we ignore this for now and characterize this 
in a forthcoming paper (Paper II). 

We choose to reconstruct the lens potential instead 
of the surface mass density because (1) it is the quan- 
tity that directly relates to the cosmological parameters 
via the time delays and angular diameter distance ra- 
tios, and (2) the surface mass density can, in princi- 
ple, be easily ob t ained by differentiation. In contrast, 
IWilliams fc Sahal ([2000') and lSaha et all (|200g ) pixchzcd 
the surface mass density. Since the surface mass density 
over the entire lens plane is required in the integral for 
obtaining the lens potential, the conversion of the (fi- 
nite) gridded mass density to the lens potential is not 
straightforward . 

We apply the pixel ated potent ial reconstruction 

method to B1608-I-656 (|Mvers et al.lll995f ). a quadru- 
ple image gravitational le ns system with an extended 
source at Zs = 1.394 (Fas snacht et al.l [19961). and two 
interacting galaxy lenses at = 0.6304 ( Mvers et al.l 
[T995i) . B1608-I-656 is special in that it is the only four- 
image gravitational lens systems with all three indepen- 
dent time delays between the image s measured with er - 
rors of only a few percent (Fassnach t et al.lll999l I2002D . 
Thus, it provides a great opportunity to measure the 
Hubble constant, which is the subject of Paper II. To ob- 
tain the Hubble constant to hig h precision, an accurate 
lens p otential model is crucial. iKoopmans fc Fassnachtl 
()1999t ) modeled this system using simply-parameterized 
lens potentials, but did not account fo r the presence of 
dust a nd the extended source intensity. iKoopmans et all 
(j2003l ) improved on the simply-parameterized modeling 



of the lens potential with the treatment of dust, the use of 
a simply-parameterized extended source-intensity distri- 
bution, and the i nclusion of constrai nts from stellar dy- 
namics. However. ISuvu fc BlandfordI (2006) showed that 
thi s most up-to-dat e simp ly-parameterized lens model 
in IKoopmans et al.l (|2003l ) fails certain tests such as 
the crossing of the critical curve through the saddle 
point of the figure-eight-shaped intensity contour of the 
merging images. This suggests that the pixelated po- 
tential method may be better suited than a simply- 
parameterized method for the two interacting galaxies. 
In this paper, we deliver a comprehensive analysis of the 
B1608-I-656 system that incorporates the effects of the 
extended source intensity, presence of dust, and inter- 
acting lenses. The dissection of B1608-(~656 allows us to 
study the relative distribution of mass and light in the 
interacting lens galaxies. 

The outline of the paper is as follows. In Section [2l we 
introduce the pixelated potential reconstruction method. 
We demonstrate the method using simulated data in Sec- 
tion[3]and generalize the method to real data in SectionUj 
The remaining sections of the paper target B1608-I-656. 
In Section [5l we summarize the Hubble Space Telescope 
(HST) observations of B1608-I-656 and present the im- 
age processing. In Section [6l we show a pixelated po- 
tential reconstruction of B1608-f656. Finally, in Sec- 
tion [71 we comment on the mass-to-light (M/L) ratio in 
B1608-I-656 based on the results of our lensing analysis. 
In Paper II, we use the resulting potential reconstruction 
of B1608-I-656 together with a study of the lens environ- 
ment to infer the value of the Hubble constant. 

Throughout this paper, we assume a flat A-CDM 
universe with = 0.26, f^A = 0.74 , and Hq — 

100/ikms"^Mpc"^ ([Komatsu et al.ll2008[ ). For the lens 
and source redshifts in B1608-j-656, 1" on the sky corre- 
sponds to 4.9ft.~^kpc on the lens plane and 6.1/i^^kpc 
on the source plane. 

2. PIXELATED POTENTIAL RECONSTRUCTION 

In the following subsections, we present the pixelated 
potential reconstruction method. Section 12.11 contains 
the formalism of the method, and Section [2?2| is a prac- 
tical implementation of the method. 

2.1. Formalism for iterative and perturbative potential 
corrections 

The iterative and perturbative potential correction 
scheme for len s systems with e x tended sources was first 
suggested b y IBlandford et al.l (|2001|) and s tudied by 
iKoopmand (2005"), 'Suyu fc Blandford (j2006l) . and re- 
cently by [Vcgetti & Koopmans (200_3). The pixelated 
potential reconstruction method t hat w e present here 
is similar to that in iKoopmani (|2005D but differs in 
the numerical details and our use of Bayesian analy- 
si s, which allows for model c omparison. The method 
in I Vegetti fc KoopmansI (|2008[ ) is also based on Bayesian 
analysis and has adaptive gridding on the source plane. 
In the rest of the section, we briefly outline the theory of 
pixelated potential reconstruction. 

The central concept for this method is to start with an 
initial lens potential model and to correct it, perturba- 
tively and iterativcly, to obtain an estimate of the true 
lens potential. The initial lens potential will usually be 
simply-parameterized (to allow faster convergence with a 
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smaller number of parameters) and ideally would be close 
to the true potential. It will then be refined via correc- 
tions on a grid of pixels. Obtaining the parameter values 
in the initial lens potential is often a nonlinear process; 
in contrast, the potential correction in each iteration is 
a linear inversion. 

One way to think about this procedure is to observe 
that in a perfectly observed image, nested intensity con- 
tours in the source plane map onto multiple regions of 
the image plane. Intensity is preserved by the lens and 
so the map is from a set of single source contours to the 
corresponding image contours. The only freedom that we 
have is to slide image points along the contours. Using 
the fact that the deflection field is curl-free effectively re- 
moves this freedom. What we describe is a procedure to 
determine this map that takes into account a finite PSF, 
dust extinction, and source-intensity contamination by 
the lens galaxy light. In Paper II, we also include the 
influence of propagation effects. 

To keep the formalism simple for the moment, let us 
ignore the effects of the PSF, dust extinction, and lens 
galaxy light. Let 9 be the coordinates on the image plane 
and (3 be the coordinates on the source plane. Let Id{d) 
be the observed image intensity of a lensed extended 
source, and let '4^{9) be an initial scaled surface poten- 
tial model^ for the lens system. Given V'l^'): one can 
obtain the best-fitting source-intensity distribution (e.g., 

ISuvu et all l2006l . and references therein). Let Is{9{(3)) 
be the source intensity translated to the image plane via 
the potential model, i!{0), where 6 and j3 are related via 
the lens equation 9 = f3 — Wijj{9). We define the intensity 
deficit (also known as the image residual) on the image 
plane by 

61(9) = 149) -him). (1) 

Suppose the initial lens potential model is perturbed 
from the true potential, iJjo{9), by 6i/j{9): 



i^{9)^M0)+S^j{9). 



(2) 



For a given image (fixed Id{9)) and the initial potential 
model ijj{9), we can relate the intensity deficit to the 
potential perturbation 5ip{9) by 



5m 



dp 



89 



(3) 



to first order in 5'^{9) (see e.g. JSuvu fc BlandfordI (|2006D 
for details) . The source-intensity gradient dis (/?) / 0(3 im- 
plicitly depends on the potential model 7/;(0) since the 
source position (3 (where the gradient is evaluated) is 
related to i!{9) via the lens equation. We can solve 
Equation ([3|) for Sip {9) given the intensity deficit and 
source-intensity gradients, update the initial (or previ- 
ous iteration's) potential model, and repeat the process 
of source-intensity reconstruction and potential correc- 
tion until the potential converges to the true solution 
with zero intensity deficit. In Section 12.21 we focus on 
solving Equation ([3]). 

^ ip includes the distance ratio. 



2.2. Implementation of pixelated potential 
reconstruction 

2.2.1. Probability theory 

The first step in solving Equation ^ for the poten- 
tial perturbation is to obtain the source-intensity gradi- 
ents and the intensity de ficit, which appear in the correc- 
tion equation. We foUow lSuvu et al.l ()2006l ) to obtain the 
source-intensity distribution on a grid of pixels given the 
current iteration's lens potential model. In this source 
reconstruction approach, the data (observed image) are 
described by the vector dj, where j = 1, . . . , iVd and 
is the number of data pixels. The source intensity is de- 
scribed by the vector Si , where i = 1 , . . . , iVg and Ng is the 
number of source-intensity pixels. The observed image is 
related to the source intensity via dj = ijiSi + rij, where 
fji is the so-called blurred Icnsing operator (mapping ma- 
trix) that incorporates the lens potential (which governs 
the deflection of light rays) and the PSF (blurring), and 
Uj is the noise in the data characterized by the covariance 
matrix Cd. In the inference of Si, we impose a prior on Si, 
which can be thought of as "regularizing" the parameters 
Sj to avoid o verfit ting to the noise in the data. Following 
ISuvu et all (|2006l) . we use quadratic forms of the regu- 
larization (specifically, zeroth-order, gradient, and curva- 
ture forms of regularization). The Bayesian inference of 
the source-intensity distribution (s^) given the observed 
image {dj) is a linear inversion and is a solved problem. 
Having obtained the source intensity, we can calculate 
the intensity deficit and source-intensity gradients. 

We pixelize the lens potential to allow for a flexible 
parametrization scheme. To solve Equation ([3|), we cast 
it into a matrix equation and invert the linear system. 
To write Equation ([3]) in a matrix form, we discretize the 
lens potential on a rectangular grid of iVp pixels (which 
is less than the number of data pixels so that the 
potential and source-intensity pixels are not undercon- 
strained) and denote the potential perturbation by Stpi 
where i = 1, . . . , A^p. The intensity deficit on the image 



grid is 51 j 



fjiSi where j = 1, . . . , (using the 



notation from source-intensity reconstruction, d, f and 
s are the data vector, the blurred lensing operator, and 
the source- intensity vector, respectively). Equation ([3]) 
now becomes 

61 = tSijj + n, (4) 

where t is a x A'p matrix which incorporates the PSF, 
the source- intensity gradient, and the gradient operator 
that acts on dtp (see the appendix for the explicit form 
of t) , and n is the noise in the data. The above equation 
is equivalent to 

d = fs + tSxp + n. (5) 
We can infer the potential corrections Stp given the 
data d, source intensity s, and source- intensity gradients 
that are encoded in t. In the inference, we impose a prior 
on Sip. The posterior probability distribution is 



likelihood 



posterior 



P(5'0|<i,f,s,t,/i,gi-^) = 



P{d\5^p,t,^,s)P{5^p\^i,^s^) 
P(d|f,s,t,/i,g5^) 



evidence 



(6) 
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matrix 



Dust extinction, if present, is also included in this mapping 
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where /i and g^^ are the (fixed) strength and form of reg- 
ularization for the potential correction inversion, and aU 
irrelevant (in)dependences have been dropped. Modeling 
the noise as Gaussian, the likelihood is 



P{d\dxP,t,f,s) 



exp(-^D(d|JV>,t,f, s)) 



(7) 



where 



EB{d\dxp,t,f,s) 



{d-fs- tdip) 



(8) 
(9) 



and Zd is the normalization for the probability. We ex- 
press the prior in the following form: 



P{Sip\n,gs.4,) 



(10) 



We use quadratic forms of the regularizing function Es;(, . 
In particular, we use the curvature form of reg ularization 



(see, for example, Appendix A of ISuvu et al.l ((2006) for 



an explicit expression of the curvature form of regulariza- 
tion) . We use this regularization instead of the zeroth- 
order or gradient forms because the lens potential should 
in general be smooth, being the integral of the surface 
mass density. Curvature regularization in the potential 
corrections effectively corresponds to zeroth-order regu- 
larization in the surface mass density corrections. This 
implies a prior preference toward zero surface mass den- 
sity corrections, thus suppressing the addition of mass to 
the initial mass model unless the data require it. 

Maximizing the posterior of parameters Sxp, we obtain 
the most probable solution 



MP 



(11) 



where 



and V = 



A = B + /xC, 

C = WWEs46iP), 
D^t^C^\d-fs), 

d 



The matrices A, B and C have dimensions A^p x A'p and 
are, by definition, the Hessians of the exponential ar- 
guments in the posterior, the likelihood, and the prior 

probability distributions, respectively. 

A s discussed i i i deta il in, for example, iMacKavl ()1992D 
and ISuvu et al.l (|2006D . the evidence is irrelevant in the 
first level of inference where we maximize the posterior 
of parameters dtp to obtain the most probable param- 
eters dxp^p. However, the evidence is crucial for the 
second level of inference for model comparison, where a 
model incorporates the lens potential, PSF, and regular- 
izations of both the source intensity and the potential 
correction. If we assert that models are equally probable 
a priori, then the evidence gives the relative probability 
of the model given the data. In other words, the ratio 
in the evidence values of two models tells us how much 
more probable the first model is relative to the second 



model, if we assume that the two models are a priori 
equally probable. Since the evidence gives only the rela- 
tive probability, the data set needs to be kept the same 
for model comparison. 

The posterior {P{6ip\d,f,s,t,fi,gg^)) and the evi- 
dence (P(<i|f, s, t, /i, g5^)) in Equation © are condi- 
tional on the source-intensity distribution. Ideally, we 
would have an expression of the posterior for both 
s and 6ip- P(s, <5t/j|<i, f , A, gg, t, /i, g^^), where A and 
gg are, respectively, the strength and form of reg- 
ularization for s. We would also obtain the evi- 
dence by marginalizing both the source-intensity and 
the potential correction values, P(<i|f , A, gg, t, /i, g^.^) = 
/ dsd^i/jP(d|s,^i/),f,t)P(s,5i/)|A,gg,/i, g5^). However, 
due to the iterative nature of the method (i.e., s and Sxp 
are not inferred simultaneously), we do not have such 
expressions for the posterior and the evidence. Prag- 
matically, we use the evidence from the source recon- 
struction (given the corrections Sxp), P{d\f,d'ip,X,gg), 
for comparing the potential models, PSF and regular- 
izations. Specifically, after iterating through the source- 
intensity reconstructions and lens potential corrections, 
we use the final corrected lens potential for one last 
source- intensity reconstruction and use the evidence from 
this final source reconstruction for comparing models. 
This approximation is valid provided that the proba- 
bility distributions of 6rp and the regularization con- 
stant are s h arply peaked at the most probable values. 
ISuvu et"an (120061 ) showed that the delta function ap- 
proximation for the regularization constant is accept- 
able; simulations of the iterative potential reconstruction 
method suggest that the probability of Sip after the final 
iteration is sharply peaked. Therefore, the probability 
of a given potential model, PSF, and form of regulariza- 
tion is P(f,gsl<i) oc / dAd^t/'P(d|f,^V, A,gs)P(f,gs) ~ 
P(d|f , ^i/). A, gg)P(f , gg), where dip and A are the most 
probable solutions. Assuming that all models are equally 
probable a priori (i.e., P(f,gg) is constant), the evidence 
from the source reconstruction serves as a reasonable 
proxy to use for model comparison. 

There is an uncertainty associated with the evidence 
values due to finite source-intensity resolution as a re- 
sult of the source pixelization. The source reconstruction 
region is initially chosen such that the mapped source 
region on the image plane encloses the Einstein ring. 
This ensures that the source region contains the entire 
source-intensity distribution. Throughout the iterative 
pixelated potential reconstruction, the source region and 
pixelization are kept the same. In the final source recon- 
struction for evidence computation, the evidence value 
depends on the pixelated source region because the good- 
ness of fit on the image plane generally changes, espe- 
cially in areas of significant intensity gradients, as one 
shifts the source region. To estimate the uncertainty in 
the evidence values, we perform the last source recon- 
struction for various source regions that are shifted by a 
fraction of a pixel from the optimized one in the potential 
reconstruction. The range of the resulting evidence val- 
ues for the various source regions then allow us to quan- 
tify the uncertainty in the evidence. In addition to the 
uncertainty due to source pixelization, the evidence also 
depends on the amount of regularization on Sxp, which 
is discussed in Section r2. 2.21 
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2.2.2. Technicalities of the pixelated potential 
reconstruction 

Solving for the potential perturbations is very sim- 
ilar to solving fo r the source-intensity distribution in 
ISuvu et"al] ()2006l ) except for the following technical de- 
tails: 

1. In each iteration, the perturbative potential correc- 
tion is obtained only in an annular region instead of 
over the entire lens potential grid due to the need 
for the source-intensity gradient (see Equation ([3])) 
to be measurable. Since the extended source inten- 
sity is only non- negligible near the Einstein ring, we 
only have information about the source-intensity 
gradients in this region. In practice, the annular re- 
gion is the mapping of the finite source reconstruc- 
tion grid that encloses the extended source with 
a minimal number of source pixels (for computa- 
tional efhciency). The annulus of potential correc- 
tions obtained at each iteration is extrapolated for 
the next iteration by minimizing the curvature in 
the potential corrections. This allows the shape 
of the annular region to change as needed when 
the lens potential gets corrected. In addition, the 
forms of the regu larization matr ix, as discussed in 
Appendix A of .Suyu et al.l (|2006l ) , are modified ac- 
cordingly to take into account the nonrectangular 
reconstruction region (described in more detail in 
the third point below). 

2. Since Equation ([5]) is a perturbative equation in 

the inversion needs to be over-regularized to 
enforce a small correction in each iteration. Em- 
pirically, we set the regularization constant, ^, at 
roughly the peak of the function fiEg^ (within a 
factor of 10), which corresponds to the value before 
which the prior dominates. The resulting evidence 
value from the final source-intensity reconstruction 
weakly depends on the value of fj,, and we include 
this dependence in the uncertainty of the evidence 
value. 

3. The potential corrections are generally nonzero at 
the edge of the annular reconstruction region. This 
calls for slightly different structures of regulariza- 
tion compa r ed to those written in Appendix A in 
ISuvu et al] (|2006f ) for source-intensity reconstruc- 
tion (since the source grids are chosen to enclose 
the entire extended source such that edge pixels 
have nearly zero intensities). The regularizations 
are still based on derivatives of Si/j; however, no 
patching with lower derivatives should be used for 
the edge pixels because the zeroth-order regulariza- 
tion at the top/right edge will incorrectly enforce 
the 6ip values to zero in those areas. The absence 
of the lower derivative patches leads to a singular 
regularization matrix, ^"'^ which is problematic for 
evaluating the Bayesian evidence for lens poten- 
tial correction. However, since we do not use the 
evidence values to compare the forms of regulariza- 
tion for the potential corrections (because we use 

Having a singular regularization matrix (C) does not prevent 
one from calculating Sip because the matrix for inversion (A = 
B -|- /iC) is, in general, nonsingular. 



only the curvature form) nor to compare the lens 
potential and PSF model, the revised structure of 
regularization is acceptable. We have found this 
structure of regularization for potential corrections 
to work for various types of sources (with varying 
sizes, shapes, number of components, etc.). 

In the source reconstruction steps of this iterative 
scheme, we discover by using simulated data that over- 
regularizing the source reconstruction in early iterations 
helps the process to converge. This is because initial 
guess potentials that are significantly perturbed from the 
true potential often lead to highly discontinuous source 
distributions when optimally regularized (corresponding 
to maximal Bayesian evidence, which balances the good- 
ness of fit and the prior), and over-regularization would 
give a more regularized source-intensity gradient for the 
potential correction. Unfortunately, we do not have an 
objective way of setting this over-regularization factor for 
the source reconstruction. Currently, at each source re- 
construction iteration, we set the over-regularization fac- 
tor such that the magnitude of the intensity deficit is at 
approximately the same level as that from the optimally- 
regularized case but with a smoother source-intensity dis- 
tribution for numerical derivatives. This scheme ensures 
that we do not over-regularize when we are close to the 
true potential. Based on simulated test runs, the re- 
covery of the true potential depends on the amount of 
over-regularization. When the initial guess is far from 
the true potential, over-regularization in the early it- 
erations is crucial for convergence. We find that it is 
better to over-regularize in excess than in deficit. Too 
much over-regularization simply leads to more iterations 
to converge, whereas too little over-regularization may 
not converge at all. 

For each iteration of source-intensity reconstruction, 
there is also a mask on the source plane to exclude source 
pixels that either (1) are not mapped by that iteration's 
lens potential on the data grid or (2) have no neighbor- 
ing pixels for the computation of numerical derivatives. 
We generalize the regularizing function for this nonrect- 
angular reconstruction region to have the right-most and 
top-most pixels (pixels adjacent to the edge or adjacent 
to the masked source pixels) patched with lower deriva- 
tives as we did f or the edge pixels in Appendix A of 
ISuvu et all ()2006( ). This patching ensures that the regu- 
larization matrix is nonsingular for the evaluation of the 
Bayesian evidence. 

Based on simulated test runs, we find that a practical 
stopping criterion for the iterative procedure is to termi- 
nate when the relative potential corrections between all 
image pairs are {Sipi — <5V'2)/('0i ~ ''P2) < 0.1%, where 1 
and 2 label the images in any pair. After this criterion is 
reached, further iterations give a negligible contribution 
to the predicted Fermat potential differences between the 
images. 

2.2.3. Mass-sheet degeneracy 

The restriction to using only isophotal data implies 
that the potential correction we obtain at each itera- 
tion may be affe cted by the "mass-sheet degeneracy" 
(|Falco et al.lll985[ ). However, the addition of mass sheets 
is suppressed by the curvature form of the regularization 
for the potential correction and also by the large amount 
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of ov er-regularization. We refer to iKochanek et al.l 
(|2006D and Paper II for a detailed description of the mass- 
sheet degeneracy; here we review a few key points that 
are relevant for the potential corrections. In essence, an 
arbitrary symmetric paraboloid, gradient sheet, and con- 
stant can be added to the potential without changing the 
predicted lensed image: 

Md)^^^-^^ + a-e + c + v^{e), (12) 

where V'l^') is the original potential, V'!^(6') is the trans- 
formed potential, and v, a, and c are constants. The 
constants a and c have no physical effects on the lens 
systems as they merely change the origin on the source 
plane (which is unknowable) and change the zero point of 
the potential (which is not observable). The parameter 
1 ~ v refers to the amount of mass sheet, which can be 
seen in the corresponding convergence transformation: 
Kv{9) = (1 — v) + vk{9). To make sure we remain "close" 
to the initial potential model, we set v = \ and fix three 
points in the corrected potential after each iteration to 
the corresponding values of the initial potential. Setting 
V — \ ensures that the size of the extended source in- 
tensity remains approximately the same, and the three 
fixed points allow us to solve for a and c in Equation (fT2]l 
to remove irrelevant gradient sheets and constants in the 
reconstructed potential. We choose the three points to 
be three of the four (top, left, right, and bottom) loca- 
tions of the annular reconstruction region that are mid- 
way in thickness between the annular edges. The three 
points are usually chosen to be at places with lower sur- 
face brightness in the ring. This technique of "fixing" the 
mass-sheet degeneracy is demonstrated in Section 12.2.41 
using simulated data. 

2.2.4. Summary 

To summarize, the steps for the iterative and perturba- 
tive potential reconstruction scheme via matrices are as 
follows. (1) Reconstruct the source-intensity distribution 
given the initial (or corrected) lens potential based on 
ISuvu et al] (|2006D . (2) Compute the intensity deficit and 
the source-intensity gradient. (3) Solve Equation ([5|) for 
the potential corrections dxj} in the annulus of reconstruc- 
tion. (4) Update the current potential using Equation 

0- V'next iteration = ''/'current iteration ~ ■ (5) Trans- 
form the corrected potential ■0n(,xt iteration '^i^ Equation 
so that u = 1 and the transformed corrected poten- 
tial has the same values as the initial potential at the 
three fixed points. (6) Extrapolate the transformed cor- 
rected potential for the next iteration. (7) Interpolate 
the transformed corrected potential onto the resolution 
of the data grid for the next iteration's source reconstruc- 
tion. (8) Repeat the process using the extrapolated and 
finely gridded reconstructed potential, and stop the pro- 
cess when the relative potential correction between any 
pair of images is < 0.1%. 

3. DEMONSTRATION: POTENTIAL PERTURBATION DUE 
TO AN INVISIBLE MASS CLUMP 

In the previous section, we have outlined a method of 
pixelated potential reconstruction. In this section, we 
will demonstrate this method using simulated data with 
a lens consisting of two mass components. 



3.1. Simulated data 

We use singular iso thermal ellipsoid (SIE) potentials 
(|Kormann et al.lll994f ) to test the potential reconstruc- 
tion method. For this demonstration, we let the lens be 
comprised of two SIEs at the same redshift Zd = 0.3: a 
main component and a perturber. The main lens has 
a one-dimcnsional velocity dispersion of 260kms~^, an 
axis ratio of 0.75, and a semi-major axis position an- 
gle of 45° (from vertical in the counterclockwise direc- 
tion). The (arbitrary) origin of the coordinates is set 
such that the lens is centered at (2.5", 2.5"), the center 
of the 5" X 5" image. The perturbing SIE is centered at 
(3.8", 2.5") with a velocity dispersion of 50kms^^, axis 
ratio of 0.60, and semimajor axis position angle of 70°. 
The exact potential is the sum of these two SIEs. We 
model the source intensity as an elliptical distribution 
inside the caustics at Zg = 3.0 with an extended com- 
ponent (of peak intensity of 1.0 in arbitrary units) and 
a central point source (of intensity 3.0). This source is 
chosen such that the lensed image resembles B1608-t-656. 
We use 100 x 100 image pixels each of size 0.05" (typi- 
cal pixel size of the //^ST" Advanced Camera for Surveys 
(ACS)), 30 X 30 source pixels each of size 0.025", and 
25 X 25 potential pixels each of size 0.2". To obtain the 
simulated data, we map the source-intensity distribution 
to the image plane using the exact lens potential and the 
lens equation, convolve the lensed image with a Gaussian 
PSF whose FWHM = 0.15" and add Gaussian noise of 
variance 0.015. Fig. [T] shows the simulated source in the 
left-hand panel and the simulated noisy data image in the 
middle panel. The Fermat potential difference between 
the images are listed in Table [T] The images are labeled 
by A, B, C, D, and their locations are (1.77", 1.02"), 
(3.90", 3.59"), (3.54", 1.26"), and (1.34", 3.38"), respec- 
tively. 

3.2. Iterative and perturbative potential corrections 

We take the initial guess of the lens potential to be 
the main SIE component but with the position angle 
changed from 45 to 40°. This corresponds to a typical 
scenario where the perturbing SIE is faint/dark so that 
it is not detected in the image, and hence is not incorpo- 
rated in the smooth parametrized model of the main SIE 
component. The rotation in the position angle of the 
main SIE component corresponds to a situation where 
the mass of the galaxy does not strictly follow the light, 
but the position angle of the lens mass distribution is ini- 
tially adopted from the position angle of the lens galaxy 
light. Here and after, "initial potential" refers to this 
initial guess of the potential model (as opposed to the 
true/exact potential). Fig.[T]shows the potential pertur- 
bation relative to the initial potential in the right-hand 
panel. In obtaining this plot, the initial potential has a 
constant gradient plane and offset added such that the 
top, left, and bottom midpoints in the annulus (marked 
by Xs in the plot) are fixed to the true potential with 
zero potential perturbation (as described in the passage 
following Equation (fT2|) ). In the iterative potential re- 
construction process, the reconstructed potential at each 
iteration also has these three points in the annulus fixed 
to the initial model. The locations of the three fixed 
points have no impact on recovering the true potential 
when the source is extended enough to form an Einstein 
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Fig. 1. — Demonstration of potential reconstruction: simulated data and potential perturbation. Left-hand panel: the simulated source- 
intensity distribution with an extended component (of peak intensity of 1.0 in arbitrary units) and a central point source (of intensity 3.0) 
on a 30 X 30 grid. The solid curves are the astroid caustics of the initial potential that consists of only the main SIE. Middle panel: the 
simulated image of the source-intensity distribution on the left using the true potential consisting of two SIEs (convolution with Gaussian 
PSF and addition of noise are included, as described in the text). The solid line is the critical curve of the initial potential and the dotted 
lines mark the annular region to which the source grid maps (using the mapping matrix f). Right-hand panel: the fractional potential 
perturbation in the initial potential model. The Xs mark the three points where we fix the potential perturbation to zero. In both the 
middle and right-hand panels, the asterisk and the plus sign indicate the positions of the main SIE component and the perturbing SIE 
component, respectively. 



TABLE 1 

The relative Fermat potential [(f> = [0 ~ /3)^/2 — V) between the four images of the true potential and of the 

RECONSTRUCTED POTENTIAL FOR, A FEW SELECTED ITERATIONS 

Source Position 

Potential ipAB 4'CB <f>DB (arcsec) 

True 0.141 0.234 0.437 ~. 

Initial 0.172 ±0.189 0.228 ±0.156 0.437 ± 0.041 (2.587, 2.483) ± (0.013, 0.076) 
lteration=0 0.178 ±0.070 0.246 ± 0.068 0.479 ± 0.010 (2.608, 2.483) ± (0.006, 0.034) 
Iteration=2 0.161 ±0.011 0.242 ±0.010 0.471 ±0.011 (2.623, 2.484) ± (0.005, 0.005) 
Iteration=9 0.151 ±0.006 0.244 ±0.004 0.454 ± 0.006 (2.621, 2.484) ± (0.003, 0.002) 

V 0.96 
Iteration=9 0.145 ±0.006 0.234 ±0.004 0.436 ± 0.006 



Note. — We use the average source position of the four source positions for the computation of the Fermat potential. The four source 
positions deviate by ~ 0.1" in the initial model, and agree within ~ 0.005" at iteration=9. The uncertainties in the predicted relative 
Fermat potential are due to the uncertainties in the source position. The good agreement between the predicted Fermat potential values 
for the initial potential and the true values is coincidental due to the use of the average source position. 



ring on the image plane. However, if the source is com- 
pact, then locations of the three points do matter and 
they are chosen to be at places where the information 
content (image intensity) is low. 

We perform 10 iterations of the perturbative poten- 
tial correction method outlined in Section 12.21 The it- 
erations are labeled "PI" from to 9. For each source 
reconstruction iteration, we adopt the curvature form of 
regularization and use the source-intensity reconstruc- 
tion for the evaluation of the source-intensity gradients 
that are needed for the potential correction. The source 
inversions are over-regularized in early iterations in or- 
der to obtain smooth source reconstructions for evalu- 
ating the gradients. For each potential correction iter- 
ation, we use the curvature form of regularization, and 
set the regularization constant for the potential recon- 
struction to be 10 X the value of fx where y^Eg^ peaks in 
iteration=0. This regularization value is ~ 10® and is 
used for all subsequent iterations (since we find that the 
peak in ijlEs^ changes little as the iterations proceed). 
For comparison, the "optimal" regularization constant is 
~ 10^ at iteration=0 and is ^ 10'' at iteration=9. There- 
fore, the potential reconstruction inversions are heavily 



over-regularized in the early iterations to keep the correc- 
tions to first order; as the lens potential gets corrected, 
the amount of over-regularization diminishes as the in- 
version approaches the linear regime with small intensity 
deficits. We show figures of source reconstructions and 
potential corrections for some, but not all, of the itera- 
tions. 

The top row of Fig. [2] shows the results of PI=0. The 
over-regularized reconstructed source in the left-hand 
panel does not resemble the original source, and the (nor- 
malized) image residual in the middle-left panel shows 
prominent arc features due to the presence of both the 
misaligned initial model and the SIE potential perturba- 
tion. The reconstructed dxjj in the middle-right panel is 
of the same structures as the exact Stj} in Fig. [U though 
the magnitude is smaller due to the correction being 
a perturbative one. A plot of the image residual after 
correction (= 81 — tSxp) continues to show arc features 
though less prominent than in the top middle-left panel 
in Fig. [21 The same image residual plot with the true 
potential perturbation also shows similar arc features, 
which indicates that Equation ([3]) is indeed a perturba- 
tive equation and thus justifies the over-regularization in 
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the potential correction step. 

The second row of Fig. [2] shows the results of PI=2. 
The reconstructed source in the left-hand panel better re- 
sembles the original source in Fig. [TJ The amount of mis- 
fit in the image residual has decreased in the middle-left 
panel, signaling that we are correcting toward the true 
potential. The middle-right panel is the potential cor- 
rection in PI=2, and the right-hand panel is the amount 
of perturbation that remains after PI=2. The amount of 
potential perturbation remaining is closer to zero com- 
pared to the top row, which is a sign that the iterative 
method converges. 

The bottom panels in Fig. [2] show the results of PI=9, 
the last iteration. The source is faithfully recovered in 
the left-hand panel, resulting in negligible image resid- 
ual in the middle- left panel (reduced = 1-02 inside the 
annulus^^). The centroid of the source is slightly shifted 
compared to the original because of our adding constant 
gradients to fix the three points in the potential correc- 
tions. The absolute position of the source is irrelevant 
as we can arbitrarily set the coordinates; it is only the 
relative positions on the source plane that matter. The 
source positions are shifted relative to the plotted caustic 
curve only because these caustic curves are the ones from 
the initial potential guess (they were not computed for 
the reconstructed potential due to the low resolution in 
the reconstructed potential grids). If we were to plot the 
caustic curve of the corrected potential, we would find 
no overall shift in the source with respect to the caustic 
curve. The middle-right panel shows the final iteration's 
potential correction, which is barely visible due to the 
negligible image residual left to correct. The right-hand 
panel shows that most of the potential perturbation to 
the true potential has been corrected, though there is 
still some left. However, this amount of remaining un- 
corrected potential perturbation leads to image residuals 
that are effectively masked by the noise in the data. We 
have thus reached the limit in the potential correction 
that is set by noise in the data. 

Table [T] lists the predicted Fermat potential differences 
for the initial potential guess and for the corrected po- 
tential in PI=0, 2, and 9. We use the average source po- 
sition (also listed in the table) of the four mapped source 
positions for the computation of the Fermat potential. 
The uncertainty in the predicted Fermat potential dif- 
ference comes from the error in the source position due 
to discrepancies in the mapped source positions of the 
four images. The mapped source positions agree within 
^ 0.005" (i.e., within a fifth of a source pixel) in the 
final iteration, a significant improvement to ~ 0.1" in 
the initial potential. The convergent Fermat potential 
differences in PI=9 are systematically higher than the 
true Fermat potential differences. This is because lens- 
ing only allows us to recover the Fermat potential dif- 
ferences up to a constant factor due to the mass-sheet 
degeneracy. The transformation in Equation (jl2[) would 
scale the Fermat potential difference by a factor of v. 
The last row in Table [1] shows that a mass-sheet trans- 
formation with V = 0.96 leads to the predicted Fermat 

^'^ The reduced is given by x^/i^p ix in annulus ~ 7)i where 
^pix in annulus is the number of data pixels in the annulus that 
encloses the ring and ■y is an estimate of the num ber of "effective" 
parameters (e.g. IMacKa^lTOQ^: ISiivu et al.ll200l) . 



potential values agreeing with the true values within the 
uncertainties. We expect this particular simulation's re- 
constructed pixelated potential to be different from the 
true potential by a mass-sheet transformation of ~ 0.96 
due to the unaccounted mass of the secondary SIE (~ 4% 
of the primary SIE) in the initial model. In the itera- 
tive potential corrections, mass additions are suppressed 
in the annulus due to the regularization. This breaks 
the mass-sheet degeneracy, but underestimates the total 
mass within the annulus (the SIE perturber was not in- 
cluded in the initial model): the reconstructed potential, 
therefore, continues to have a deficit of mass in the annu- 
lus. Since the value of the convergence in the annulus is 
generally less than 1, the reconstructed potential is thus 
approximately a mass-sheet transformation of the true 
potential with the mass deficit in the form of a constant 
sheet. 

The simulation we have shown is one of the worst-case 
scenarios where even the total mass of the initial lens 
model enclosed within the Einstein ring is wrong. For 
initial potential models that have the correct amount of 
mass within the Einstein ring (this enclosed mass is what 
lensing can robustly measure to ~ 1% — 2% accuracy in 
real systems) and with the mass-sheet degeneracy broken 
(using external information such as stellar dynamics), 
the reconstructed potential would faithfully recover the 
Fermat potential. 

3.3. Discussion 

This demonstration shows that the iterative and per- 
turbative potential reconstruction method works in prac- 
tice. Using simulated data, we find that potential per- 
turbations < 5% (which may correspond to as much as 
~ 20% in the relative potential perturbations between 
image pairs) are correctable, though the actual amount 
depends on the amount of over-regularization for both 
the source inversion and the potential correction, and on 
the extendedness of the source-intensity distribution. In 
the case where the solution converges, the magnitudes 
of the relative potential corrections between image pairs 
steadily decrease, and we end the iterative procedure 
when the stopping criterion (described in Section 12. 2p 
is met. 

Regarding the size of the source-intensity distribution, 
the more extended a source is, the better we can recover 
the potential. When the source is extended enough to 
be lensed into a closed ring, the true potential can be 
fully recovered (up to the limit set by the noise in the 
data) from potential corrections based on Equation ([5]). 
When the source is extended to cover about half of the 
Einstein ring, then the corrected potential faithfully re- 
produces the source with negligible image residual, but 
the relative Fermat potentials may not be recovered due 
to a slight relative offset in the potential between the 
images. This is because the "connecting characteristics" 
(see Suyu & Blandford (2006)) that fix the potential dif- 
ference between the images go through regions without 
much signal (light of the lensed source). Therefore, the 
potential is locally corrected at regions near the images 
(where there is light), but the global offset between the 
regions cannot be determined. 

For sources that are small in extent, the potential cor- 
rection also depends on the points we choose to fix to the 
initial potential model. Since an isolated image is gener- 
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Fig. 2. — Demonstration of potential reconstruction: results of source-intensity reconstruction and potential correction for iteration = 
0, 2, and 9. The top row shows the results for PI=0. Left-hand panel: the reconstructed source intensity using curvature regularization 
that is over-regularized to ensure a smooth resulting source for evaluation of the gradients. The caustic curves in solid are those of the 
initial potential. Middle-left panel: the normalized image residual (difference between the simulated image and the predicted image from 
the reconstructed source in the left-hand panel, in units of the estimated pixel uncertainty from the data image covariance matrix). The 
prominent arc features are due to the potential perturbation. Middle-right panel: the reconstructed Srp using the source-intensity gradients 
and image residual. Right-hand panel: the amount of potential perturbation that remains to be corrected. The middle and bottom rows 
show the results for PI=2 and PI=9, respectively, with the panels arranged in the same way as in the top row. As the iterative potential 
correction proceeds, the source resembles better the original source in Fig. [T] the image residual becomes less prominent, and the magnitude 
of the reconstructed Srp decreases. At PI=9, the source in the left-hand panel has been faithfully reconstructed that results in negligible 
image residual in the middle-left panel. The remaining potential perturbation in the right-hand panel, now close to zero, cannot be fully 
corrected due to the noise in the data. 

from the source reconstruction in principle can be used 
to compare the different potential grids. In general, we 
find that a potential grid that is ^ 4 times coarser than 
the image grid works well. 

In Section 21 we generalize this iterative potential re- 
construction method, which has been shown to work on 
simulated data, to treat real gravitational lens images 
such as B1608+656. 



ally more prone to having its potential be offset relative 
to the other images, we set two of the three fixed points 
in the gaps on both sides of the most isolated image and 
one point near the connecting images. 

We find that a wrong PSF model (e.g., of a differ- 
ent width) would lead to intensity deficit that would not 
be correctable by the iterative potential reconstruction 
method. Therefore, an uncorrectable image residual is a 
sign that our model of the system (other than the lens 
potential) is wrong. 

The potential grid that we used was 25 x 25, which we 
find to be a good balance between the number of degrees 
of freedom and goodness of fit. The higher the number 
of potential pixels, the better one can fit to the image 
residual; however, in this case, it is also more proba- 
ble to have degenerate solutions. The Bayesian evidence 



4. GENERALIZATION TO REALISTIC DATA: 
INCORPORATING DUST EXTINCTION AND LENS 
GALAXY LIGHT 

In the previous section, we have demonstrated the 
method of pixelated potential reconstruction using sim- 
ulated data. In the mock data, only the image of lensed 
source was there; in reality, there would also be light 
from the lens galaxy. Furthermore, in some cases, such 
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as B1608+656, dust is present and absorbs light from 
both the source galaxy and the lens galaxy. Based on 
results of the previous section, an accurate extraction 
of the light from the lensed extended source is crucial 
for reconstructing the lens potential. Therefore, we will 
generalize the formalism given in Section[2]to incorporate 
the lens galaxy light and dust. 

Suppose that we have a set of PSF, dust, and lens 
galaxy light models (the process of obtaining these mod- 
els is described in detail in Section (5]), a lens potential 
model, and the observed image. Separating the observed 
image into two components, the lensed source and the 
lens galaxy, we can model the observed image (as a vec- 
tor for the intensities of the image pixels) as 

lensed extended source lens galaxy 

d= B K L s + B~1r7 + n, (13) 

where B is a PSF blurring matrix, K is a dust extinction 
matrix, L is the lensing matrix (containing the lens po- 
tential model), s is the source- intensity distribution, I is 
the lens galaxy intensity distribution, and n is the noise 
in the data characterized by the covariancc matrix Cd- 
T his is an extended version of the equation d = fs + n 
in iSuvu et all (|2006[ ) with f replaced by B • K • L and d 
replaced by d — B • K • L The order of the matrix products 
in both terms are obtained by tracing backwards along 
the light rays: we first encounter the PSF blurring from 
the telescope (B), then dust extinction (K) in the lens 
plane, then the strong lensing effects (L) in the case of 
the lensed source, and finally the origin of light (s or I). 

Here we assume that the dust lies in a screen in front of 
the lensed source and the lens galaxy. This assumption 
is not strictly valid for the lens galaxy if the dust we re to 
have originated from G2 (|Surpi fc Blandfordl[200l ). In 
this case, the dust and stars are mingled together in the 
lens galaxy. It is beyond the scope of this paper to treat 
this mixed light and dust problem. However, we note 
that the dust screen assumption is acceptable since the 
aim is to obtain an accurate lensed source-intensity dis- 
tribution (for which the dust screen assumption is valid) 
and not the lens galaxy intensity distribution near the 
core where the mixing effects would dominate. Further- 
more, in simple toy models, where either the dust and 
stars are uniformly mixed or the dust is a screen lying 
inside the lens galaxy, we find that the extinction of the 
lens galaxy light is well approximated as extinction by a 
foreground dust screen with a reduced visual extinction. 
Our simple foreground dust screen model thus provides 
an effective extinction that incorporates the reduced ex- 
tinction for the lens and the full extinction by a fore- 
ground dust screen for the lensed source. 

If the lensed source contains a bright core such as an 
active galactic nucleus (AGN), then we could consider 
extending Equation (jl3p and model the observed image 
as 

-^images 

d=B K L s+ K,a,PSF(^,)-f B-K-« + n, (14) 

1=1 

where the light from the extended part of the host (the 
first term) would be modeled separately from that from 
the point sources (the second term), and ai are the in- 
tensities (fiux per unit solid angle in a pixel) of the point 



sources (which are generally not the same for all im- 
ages due to finite resolution — both lensing and microlens- 
ing give rise to different magnification of the point-like 
source — and, in the case of a time- varying core, time de- 
lay difference). However, it is the extended image surface 
brightness that provides the information needed to recon- 
struct the lens potential. For B1608-I-656, by taking into 
account the errors in the modeling associated with the 
presence of the point sources (see Section fS. 2. ip . we will 
find that a separate modeling of the point sources is not 
necessary for reconstructing the lens potential. 

Given B, K, I, L and d, one can solve for the most prob- 
able s ource-intensity distribution smp, as in iSuvu et al.l 
d2006 f) . Furthermore, one can use the Bayesian evidence 
of the source reconstruction to rank different models of 
PSF, dust extinction, lens galaxy light, and lens potential 
(see Section r2.2.ip . When we compare models, we mark 
an annular region enclosing the Einstein ring and use the 
same annulus of data for all models (where models refer 
collectively to the lens potential, PSF, dust, lens galaxy 
light, and regularization) . For the chosen data set, we 
determine the source region that maps to the annular 
region and reconstructs the source intensities in this re- 
gion. The shape of this source region is generally not 
rectangular, so we generalize the regularization schemes 
in Appendix A of [Suyu et al. (2006) to patch the right- 
most and top- most pixels (pixels adjacent to the edge 
of grid or adjacent to the unmapped source pixels) with 
lower derivatives. We will use the Bayesian evidence val- 
ues from the source reconstruction in Sections [5] and [H] 
to compare various PSF, dust, lens galaxy light and lens 
potential models for B1608-f 656. 

To include the effects of galaxy light and dust in the 
pixelated potential reconstruction method, we incorpo- 
rate K and I into Equation (O as in Equation and 
include K into t (see the Appendix for this inclusion). 
After these adjustments, we can iteratively correct for 
the lens potential in real systems given a PSF, a dust, 
and a lens galaxy light model based on the machinery we 
developed in the previous sections. 

To conclude, we have outlined and demonstrated an 
iterative and perturbative potential correction scheme 
where the accuracy in the reconstruction is limited by 
the noise in the data. The inputs for this method are an 
initial guess of the lens potential as well as assumptions 
regarding the PSF, dust, and lens galaxy light. The out- 
puts are the reconstructed potential on a grid of pixels, 
the reconstructed source-intensity distribution, and the 
Bayesian evidence from source reconstruction, given the 
assumptions. Our goal is to apply this method to the 
well-observed lens system B 1608+656, and we begin by 
describing our _H5'Tobservations of B1608-I-656 in Section 

El 

5. IMAGE PROCESSING OF B1608+656 

5.1. HSTobservations of B1608+656 

B1608+656 was observed with the ACS camera on 
HSTiii the F606W and F814W filters in 2004 August 
(Proposal 10158; PI:Fassnacht), specifically to get high 
signal-to-noise ratio (S/N) images of the lensed source 
emission. Table [2] summarizes the observations. Each 
orbit of the ACS visits consisted of one four-exposure 
dither pattern in either F606W or F814W through the 
Wide Field Channel (WFC). We used the same dither 
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pattern described in lYork et aTl (|2005D to permit driz- 
zling to a higher angular resolution than the default ACS 
CCD pixel size (~ 0.05"). This subpixel scale is espe- 
cially important for characterizing the PSF. 

In order to correct for the dust extinction in the lens 
system, we also include the Near Infrared Camera and 
Multi-Object Spectrometer 1 (NICMOS) F160W images 
(Proposal 7422; PLReadhead). Details of the NICMOS 
observations are also listed in Table [21 

The ACS images of B1608-I-656 are presented in Fig. [3] 
and show the two Icnsing galaxies and the presence of 
a dust lane through the system. We need to correct for 
both the dust lane and the light from the lens galax- 
ies, which can affect the isophotes of the Einstein ring 
of the extended lensed source. Before we can determine 
the amount of extinction, we need to first unify the res- 
olutions of the images in different wavelength bands due 
to PSF dependencies. This requires PSF modehng, de- 
convolution, and reconvolution for images. Having uni- 
fied the resolutions of the images, we can determine the 
intrinsic colors of the various components (lens galax- 
ies, lensed source galaxy, AGN at the core of the source 
galaxy) in the system that are required for the dust cor- 
rection. After correcting for dust, we can then determine 
the light profiles of Gl and G2 by fitting them with Sersic 
profiles (/(r) c>c exp(— (r/a)^/") where r is the radial co- 
ordinat e, a is a scale length, and n is known as the Sersic 
index; (jSersidlTOBSf )). It is only at this stage, with the 
PSF, dust map, and lens galaxies' light profiles, that we 
can recover the lensed Einstein ring surface brightness 
distribution for lens potential modeling. 

To execute the above plan of attack, in Section 15.21 
we begin by describing the drizzling process for the ACS 
images that are used for the analysis. In Sections [531- 
15.51 we present a suite of PSF, dust, and lens galaxies' 
light models and describe in detail how they are obtained. 
Finally, in Section [5. 6[ we compare these models. 

5.2. Image drizzling 

In the following subsections, we briefiy describe the 
drizzling process for combining the dithered ACS images 
and discuss the alignment of the NICMOS image to the 
ACS image. 

5.2.1. ACS image processing 

The A CS data were reduced using the multidrizzle 
package (jKoekemoer et aLll2002i ) in an early version of 
the HAGGLeS image-processing pipeline (P. J. Marshall 
et al. 2009, in preparation), producing drizzled images 
with a 0.03" pixel scale. The drizzled ACS images are 
shown in Fig. [3l The corresponding output weight im- 
ages from multidrizzle give the values for the inverse vari- 
ance of each pixel. We approximate the noise covariance 
matrix as diagonal and use the variance pixel values for 
the diagonal entries, even though drizzling will correlate 
the noise between adjacent pixels. It is assumed that the 
effect of drizzling can be modeled as having a diagonal 
covariance matrix with the diagonal elements rescaled 
(jCasertano et al.ll2000f) . In practice, we do not need to 
do the rescaling because the ranking of the models us- 
ing the relative log evidence values from the source re- 
construction is insensitive to rescaling of the covariance 
matrix. 



A pixelated representation of a continuous intensity 
distribution generally introduces error in the interpolated 
intensity values between pixels, especially for intensity 
distributions with sharp features. This error should be 
incorporated into the likelihood function. Therefore, for 
modeling the source-intensity distribution on a grid (in 
Sections 15.61 and [S]), we also include the error due to 
pixelization on the image and source planes (which we 
call "regridding error") in the image covariance matrix. 
We express the regridding error on the image plane in 
terms of the data (instead of on the source plane and 
transforming it to the image plane) in order to obtain 
a noise map that is independent of the pixelated lens 
modeling. The regridding error associated with pixel i is 

KHd)^ = Y^M.^ L ^^' (15) 

j G pixels 
adjacent to i 

where is the lensing magnification at pixel i, A/3 is 
the source pixel size, AO is the image pixel size, A^adj 
is the number of pixels adjacent to pixel i, and di (dj) 
is the image intensity at pixel i (j). The summation 
divided by A0^ in the above equation is a conservative 
estimate on the error due to pixelization on the image 
plane. Since sharper features in the image have larger 
gradients (hence, larger values for the summations), the 
regridding error is higher in these areas by construction. 
The remaining quantities in the equation, ^iA/3^/12, ac- 
count for the uncertainty in the predicted image (the 
source image mapped to the image plane) due to the pix- 
elization of the source-intensity distribution. The factor 
1/12 is the second moment of a uniform distribution be- 
tween —0.5 and 0.5. When one constructs the predicted 
image by mapping each image pixel to the source plane 
and reading off the source-intensity value, the mapped 
source position (of an image pixel) is generally not cen- 
tered on a source pixel, but have on average a (l/vT2)- 
pixel shift from the center of the source pixel. Therefore, 
Af3/VT2 is the effective size of the source pixel, which is 
then magnified by (on average) due to lensing. In 
the pixelated potential reconstruction, we approximate 
the magnification at each image pixel (which requires the 
second derivative of the potential) by the value computed 
from the initial potential because (1) the approximation 
enforces the regridding error to be independent of the 
pixelated potential modeling and (2) the corrected po- 
tential values are obtained on an annular region only a 
few pixels thick. Having obtained an estimate for the 
regridding error, we add it in quadrature to the variance 
from the weight image to obtain the entries of the ap- 
proximated diagonal covariance matrix. 

The inclusion of the regridding error is important for 
source-intensity reconstructions with sharp intensity fea- 
tures (such as the presence of a bright core); it has the 
effect of stabilizing the evidence values with respect to 
choices in the source pixelization. Without including the 
regridding error, a pixelated description of, for example, 
a source-intensity distribution with a bright core would 
be highly sensitive to the centering of the core on the 
source pixels. A small mismatch could create large im- 
age residuals near the cores that would veto an otherwise 
good lensing model, which has the rest of the extended 
features well described. Such an undesirable effect is 
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TABLE 2 

HSTOBSERVATIONS OF B1608+656 



Proposal 


Proposal 


Date Instrument 


Filter 


Exposures Exposure Time 




IJJ 








(s) 


C. Fassnacht 


10158 


2004 Aug 24 ACS/WFC 


F606W 


4 


609 










4 


646 








F814W 


4 


632 










4 


646 






2004 Aug 25 ACS/WFC 


F606W 


8 


609 










8 


646 








1 814W 


8 


632 










8 


646 






2004 Aug 29 ACb/WtC 


1 dOdW 


4 


609 










4 


646 








F814W 


4 


632 










4 


646 






2004 Sept 17 ACS/WFC 


F606W 


4 


609 








F814W 


4 


632 










4 


646 










4 


646 


A. Readhead 


7422 


1998 Feb 7 NICl 


F160W 


5 


3840 










1 


2048 










1 


896 



mostly removed by the inclusion of the regridding er- 
ror. For B1608+656, the ratio of the regridding error to 
the error from the multidrizzle weight image is around 
~ 30 near the image centroids and ~ 1 in other parts in 
the Einstein ring. 

5.2.2. NICMOS image processing 

The NICMOS F16 0W image was taken from 
iKoopmans et all (|2003D . Drizzled images on rectangu- 
lar grids for different instruments are generally not on 
the same resolution and not aligned. This is the case for 
the NICMOS and ACS images. We use SWarp^^ to align 
the combined NICMOS image to the ACS images. The 
final SWarped NICMOS F160W image with 0.03" pixel 
scales is shown in Fig. 

5.3. PSF modeling 

In this subsection, we describe the procedure for ob- 
taining the PSFs for each of the ACS and the NICMOS 
data sets. 

5.3.1. ACS PSF 

T he ACS PSF is both spatially and temporally varying 
(e.g. [Rhodes et al.ll2007[ ). One' source of temporal vari- 
ation is the "breathing" of the telescope while it orbits, 
which causes the focal length (and, hence, the PSF) of 
the telescope to change. Instead of adopting a universal 
PSF, we take the approach of modeling several PSFs us- 
ing different means, and quantitatively comparing them 
using the Bayesian analysis described in Section 12.2.11 
This has the advantage of using the data (the observed 
image) to rank the models. For each of the two drizzled 
ACS images, we create five model s for the PSF eit her 
based on the TinyTim package (Kri st fc HooklfTQQTf ) or 
from the unsaturated stars in the field: (1) drizzled PSF 
(" PSF-drz") from a se t of TinyTim simulations (follow- 
ing |RhodiseOni2Q03) , (2) single (nondrizzled) TinyTim 

A package developed by Emmanuel Bertin at Institut 
d'Astrophysique de Paris for resampling and coadding together 
FITS images. 



PSF ("PSF-f3") with a telescope focus value of -3, (3) 
the closest star ("PSF-C") located at ~ 9" in the north- 
east direction from B1608-I-656 in the drizzled ACS field 
with a Vega magnitude of 21.3 in F814W, (4) bright star 
#1 ("PSF-Bl") that is located at ~ 1.9' southwest of 
B1608-I-656 in the drizzled ACS field with a Vega mag- 
nitude of 18.7 in F814W, and (5) bright star #2 ("PSF- 
B2") that is located at - 1.6' south of B1608+656 in the 
drizzled ACS field with a Vega magnitude of 19.1. 

The TinyTim frame (s) were drizzled and resampled to 
pixel sizes of 0.03" to match the resolution of the ACS im- 
ages. We keep in mind that the TinyTim PSFs (PSF-drz 
and PSF-f3) may be insufficient due to the time-varying 
nature of the PSF and the aging of the detector since the 
TinyTim code was written. We expect the closest star 
to B1608+656 (PSF-C) to be a good approximation to 
the PSF because the spatial variation of the PSF across 
~ 9" should be negligible and any temporal variations 
are the same as in the lens system. However, this closest 
star is not bright enough to see the secondary maxima in 
the PSF, so we additionally include two of the brightest 
stars in the drizzled field mentioned above. For each of 
the stars in F606W and F814W, we make a small cutout 
around the star (25 x 25 pixels for PSF-C, 51 x 51 pixels 
for PSF-Bl, and 41 x 41 pixels for PSF-B2) and center 
it on a 200 x 200 grid, which is the size of the drizzled 
science image cutouts of B1608+656 that arc used for 
the image processing. 

5.3.2. NICMOS PSF 

The NICMOS PSF is thought to be more stable, and 
thus we assume a TinyTim model for it. The output 
TinyTim PSF is in the CCD frame of NICMOS with 
pixel size 0.043". As with the F160W science image, the 
PSF was SWarped to be aligned with the ACS images 
with 0.03" pixels. Since there is only one PSF model for 
NICMOS, PSF specifications throughout the rest of this 
paper refer to the ACS PSFs. 

5.4. Dust correction 
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Fig. 3.— Left-hand (right-hand) paneh drizzled HSTACS F606W {F814W) images with 0.03" pixels from 9 (11) HSTorhits. The dust 
lane and interacting galaxy lenses are clearly visible. The white dots indicate the centroid positions of the images. 
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Fig. 4.— HSTmCMOS F160W image that is SWarped to 
aligned to the ACS frame with a 0.03" pixel size. The white dots 
indicate the centroid positions of the images. 



With observations in two or more wavelengths, we 
can correct for the dust extinction using empirical 
dust extinctio n laws . We adopt the extinction law of 
ICardelli et al.l ()1989[ ) with the following dust extinc- 
tion ratios at the redshift of the lens Zd = 0.63 for 
Rv = 3.1 (Galactic extinction): Apeoew/^v = 1.56, 
^F8i4w/^v = 1-14, and ^pieow/^v = 0.41, where A\ 
is the extinction (difference between the observed and 
intrinsic magnitudes) at wavelength A. These dust ex- 
tinctio n ratios ag ree with the values from the extinction 
law in lPeil (|1992f ) to within 1.5%. In order to correct for 



the extinction, we need to know the intrinsic colors of 
the objects (details in Section [SAT]). For each color type 
of object (the lens galaxies, the source galaxy, and the 
AGN of source galaxy), we denote the intrinsic color by 
Qf = (m._F,intrinsic-"iiantrinsic) whcrc = 1 , . . . , iVb is in 
sequence from the reddest to the bluest wavelengths (by 
construction Qi = 0), and A^b is the number of wave- 
length bands used for dust correction. Combining the 
dust extinction ratios and the definition of intrinsic col- 
ors, we can model the observed magnitudes at each image 
pixel in each of the wavelength bands F in terms of Ay 
and the intrinsic magnitude of the reddest wavelength 
band TOi^ntrinsic as 

nip = mi?^obscrvcd = "li, intrinsic + Qf + +"F , (16) 

where kp = Ap j Ay are constants given by the extinction 
law and np va the noise in the data of wavelength band 
F . We can solve for Ay and toi, intrinsic at each image 
pixel by minimizing the following Xdust ^'^^ each pixel: 



Xdust 



E 

F=l 



(mp - mi, intrinsic - Qp - Aykp) . (17) 



We have weighted the images of the different bands 
equally because the uncertainty associated with mp is 
negligible compared to that of Qf, and the uncertainties 
in Qp are of comparable magnitudes for the different 
bands F relative to the reddest. The solution that min- 
imizes xlust is 



Ay = 



mp 



' ^ kpmp + ^ kpQp 
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(18) 



and 



^1. intrinsic 




(19) 

where the sums over F go from 1 , . . . , iVb . We emphasize 
that Equations (fT8|) and (fT9|) give the ^4^/ and mi^ntrinsic 
at each pixel. Since Ay varies from pixel to pixel (de- 
pending on the amount of dust seen in that pixel), the 
various Ay values of all pixels provide a dust map. Sim- 
ilarly, the TOi, intrinsic valucs of all pixels give the dust- 
corrected image in the reddest wavelength band. The re- 
sulting values of mi^intrinsic and the intrinsic colors yield 
the intrinsic (dust-corrected) magnitudes in the other 
bands mj^^ntrinsic where F = 2, . . . , iVb. For any one band 
-F, we can then construct the diagonal dust matrix K in 
Equation (|13p whose nonzero entries are lO"*' '*'^'^'^^' . 

5.4.1. Obtaining the intrinsic colors 

The dust correction method outlined above requires 
the intrinsic colors to be determined from the color maps. 
To construct the color maps, we need to unify the dif- 
ferent resolutions of the images in different bands (due 
to the wavelength dependence of the PSF). We do so by 
deconvolving the F606W, F814W, and F160W images us- 
ing their corresponding PSFs, and reconvolving the im- 
ages with the F814W PSF for each set of the five ACS 
PSFs and the single NICMOS PSF described in Section 
15.31 Reconvolved images are preferred to deconvolved 
images, because the latter show small-scale features (of 
a few pixels' size) that are artificial due to the ampli- 
fication of the noise during the deconvolution process. 
We select the F814W PSF for the reconvolution because 
F814W will be used for the lens potential modeling, due 
to its high S/N compared with F160W and its less severe 
dust extinction compared with F606W. In working with 
the reconvolved images, we assume that the dust varies 
on a scale larger than the F814W PSF, which is true for 
the regions near the Einstein ring. For the deconvolution, 
we use IDL's max_ei itropy iterat i ve rou tine that is based 
on the algorithm bv lHollis et al.l ()1992f ). We were unable 
to deconvolve the ACS F814W image using PSF-f3. This 
suggests that PSF-f3 is a bad model, which we have ex- 
pected due to temporal variations in the PSF. PSF-f3 is 
a single-epoch PSF whereas the F814W image was driz- 
zled from multiple exposures. We, therefore, discard this 
PSF model. 

For each set of PSF models (PSF-drz, PSF-C, PSF- 
Bl, and PSF-B2 for ACS, and TinyTim PSF for NIC- 
MOS), we construct the color maps F606W-F814W, 
F606W-F160W, and F814W-F160W from the recon- 
volved F606W, F814W, and F160W images. Fig. [5] 
shows the three color maps derived for PSF-Bl. Re- 
gions with bluer color slightly west of Gl are shown 
in all three color maps. Since the centroid of this 
blue region is offset from the centroid of Gl, we be- 
lieve that this blue region arises from differential red- 
dening and not from intrinsic color va riations within Gl, 
which is an elliptical galaxy (jSurpi fc Blandford 2003 ) . 
Since elliptical galaxies typically contain little dust. 



iKoopmans fc Fassnachtl (|1999f ) and ISurpi fc BlandfordI 
(2003^ suggested that the dust comes from G2, likely 
a dusty late-type galaxy, through dynamical interaction. 
This may explain why the spectrum of Gl shows sig- 
natures of a young ste llar population plu s a poststar- 
burst population (Dressier fc GunnI Il983t iMv ers et al.l 
[1991 ISurpi fc BlandfordI 120031: iKoopmans eT al. 2003^ 
gas from G2 may have been transferred to Gl, where the 
tidal interactions may have triggered star formation. 

The color maps also show regions of bluer color around 
images C and D, and we again believe that these are 
mostly differential reddening due to the misalignment of 
the image positions and the centroids of these blue re- 
gions, especially in F606W-F160W and F814W-F160W. 
Furthermore, we find more dust at the crossing point of 
the isophotal separatrix (the figure-eight-shaped inten- 
sity contour) of the image pair A-C. This is encourag- 
ing, as lensing models indeed predict the crossing point 
to be closer to image A (see discussion in Section r5.4.2p . 
However, these bluer regions near images C and D may 
also arise from the lensed source being intrinsically bluer 
than the surrounding emission. The F814W-F606W 
color for these blue regio ns is consistent with typical 
star-forming galaxies (e.g. IColeman et"anil980l ). In the 
F606W-F814W color map, there is a faint ridge of redder 
color connecting images A and C. This may be due to the 
asymmetry in the stellar PSF model (with the star po- 
sition not exactly centered within a pixel), which would 
cause the F606W and F814W isophotes to shift relative 
to each other after the deconvolution and reconvolution. 
For the color maps from the other PSF models, we find 
that the color maps from PSF-C and PSF-B2 look sim- 
ilar to that from PSF-Bl with varying amounts of noise 
due to varying brightnesses of the stellar PSFs. PSF-drz 
gave color maps that differ from those from the stellar 
PSFs (PSF-C, PSF-Bl and PSF-B2) because PSF-drz, 
especially in the F606W band, did not exhibit a single 
brightness peak but a string of equal brightness pixels 
at the center due to frame alignment difficulties during 
the drizzling process. This caused the brightest pixels in 
the Einstein ring to shift by ^ 1 pixel after the decon- 
volution and the reconvolution process in F606W, and 
created artificial sharp highlights tracing the edge of the 
ring in the F606W-F814W color map. As will be seen 
in Section 15.61 this leads to PSF-drz and its resulting 
dust map giving a lower goodness of fit in the lens inver- 
sion, and hence being ranked lower compared with other 
models. 

In each of the color maps, we define three color regions 
for the three color components: one within the Einstein 
ring for the lens galaxies (we assume Gl and G2 to have 
the same colors), one for the Einstein ring of the lensed 
extended source, and one for the lensed AGN (core o f the 
extended source). Following IKoopmans et al.l (|2003f l. we 
determine the bluest color within each region, assume 
that this part of the region was not absorbed by dust, 
and adopt this color as the intrinsic color. This assumes 
that each of the three components has a constant intrin- 
sic color. This would allow us to obtain the differential 
reddening for each of the components across the lensed 
image; absolute reddening is not needed because a uni- 
form dust screen does not affect lens modeling. Table 
[3] lists the intrinsic colors for each of the three pairs of 
color maps. The intrinsic colors of F606W-F814W are 
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F606W - F160W 



F814W - F160W 




Fig. 5.— From left to right: the derived color maps F606W-F814W, F606W-F160W, and F814W-F160 using PSF-Bl. 



TABLE 3 

Intrinsic colors of the AGN, Einstein ring, and lens 
GALAXIES IN B1608+656 







F606W-F814W 


F814W-F160W 


F606W-F160W 


PSF-drz AGN 


0.50 


1.4 


1.91 




Ring 


0.70 


1.5 


2.20 




Lens 


0.84 


1.0 


1.88 


PSF-C 


AGN 


0.78 


1.3 


2.10 




Ring 


0.84 


1.5 


2.30 




Lens 


1.04 


1.0 


2.05 


PSF-Bl 


AGN 


0.72 


1.1 


1.85 




Ring 


0.76 


1.3 


2.10 




Lens 


1.04 


0.82 


1.85 


PSF-B2 


AGN 


0.70 


1.17 


1.99 




Ring 


0.80 


1.3 


2.10 




Lens 


1.01 


0.85 


1.92 



Note. — The intrinsic colors are based on color maps derived 
from the four ACS PSF models (PSF-drz (drizzled TinyTim), PSF- 
C (closest star), PSF-Bl (bright star #1), and PSF-B2 (bright star 
#2)) and the single NICMOS TinyTim PSF. The intrinsic colors 
for each of the three color regions are determined from the bluest 
colors in the respective region. The uncertainties on the intrinsic 
colors vary from 0.02 to 0.1. The higher uncertainties are associated 
with the F160W image, which has a lower S/N. 

not identical to the difference between F606W-F160W 
and F814W-F160W, but agree within the uncertainties 
(0.02-0.1). 

5.4.2. Resulting dust maps 

With the intrinsic colors determined for each PSF 
model, we obtain two dust maps (Ay maps) using (1) 
only the ACS F606W and F814W images and (2) the 
ACS F606W and F814W images together with the NIC- 
MOS F160W image. In this way, we can assess whether 
the inclusion of the lower S/N NICMOS image (with the 
much broader PSF) improves the dust correction. 

The left-hand panel of Fig. [6] is the resulting Ay dust 
map derived using PSF-Bl and using images in all three 
bands. The dust map shows the east- west dust lane 
through the system (absorbing light from C, G2, Gl, 
and D) that is visible in the original drizzled ACS F606W 
and F814W images. There is little extinction near im- 
ages A and B, but there are faint rings surrounding the 
images that are mostly due to imperfect F160W decon- 
volution. We note that the low S/N exterior to the Ein- 
stein ring results in the dust map being noisy in this 
area. We make sure that these noisy areas are not in- 



cluded in the Bayesian evidence computations in Sections 
15.61 and [5] The right-hand panel of Fig. [5] is the result- 
ing dust-corrected F814W image that exhibits two signs 
of proper dust correction: the correctly shifted crossing 
point of the isophotal separatrix of the image pair A-C, 
as shown more clearly in Fig. [3 and the smoother lens 
galaxy profiles. As a result of recovering the absorbed 
light, the dust-corrected image has higher intensity val- 
ues than the uncorrected image. Therefore, we create a 
weight map for the dust-corrected image by scaling the 
multidrizzle weight image in order to keep the S/N of 
each pixel the same (before and after dust correction). 
This "dust-corrected weight image" will be used in the 
next section for determining the lens galaxy light. 

The dust maps obtained from the other PSF models 
with or without the inclusion of the NICMOS image show 
similar features except for the following two dust maps. 

1. The ACS-only (no NICMOS) dust map from PSF- 
B2 showed a faint ridge of dust connecting images 
A and C. As explained, this may be due to the 
asymmetrical/bad PSF model. Since the dust map 
otherwise exhibits the correct features, we keep this 
dust map for the next analysis step. 

2. The ACS-only dust map from PSF-drz showed 
prominent artificial lensing arc features due to the 

1 pixel offset in the image positions/arcs in the 
deconvolved and reconvolved F606W and F814W 
images, respectively. Therefore, we discard this 
dust map of the ACS-only images for PSF-drz, but 
keep the dust map derived from using all three 
bands (that includes NICMOS). 

After discarding the ACS PSF-f3 and the ACS-only 
dust map from PSF-drz, we have a total of seven dust 
maps (and resulting dust-corrected F814W images). All 
of these are reasonable dust corrections to use since they 
are derived using representative PSFs and intrinsic col- 
ors. We will compare these dust maps and PSF models 
in Section [521 

5.5. Lens galaxy light 

For each of the seven resulting dust-corrected F814W 
images in Section 15.4.21 and its corresponding PSF, 
we create an elliptical mask for the lens galaxies' re- 
gion that excludes the Einstein ring, and fit the lens 
galaxies' light to elliptical Sersic profiles using GALFIT 
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Fig. 6. — Left-hand panel: the Ay map obtained from dust correction with PSF-Bl using all three bands of images and the intrinsic 
colors listed in Table |3] The galactic dust extinction law was assumed. The dust lane through images C, G2, Gl, and D is visible. 
Right-hand-panel: dust-extinction-corrected F814W image using PSF-Bl and the three-band dust map in the left-hand panel. Compared 
to the right-hand panel in Fig. \3\ the light profile of Gl is more elliptical and the crossing point of the isophotal separatrix of images A 
and C has shifted toward A after the dust correction. 

iKoopmans et al.l (|2003f ) found Gl to be well described by 
n = 4 (de Vaucouleurs profile). With the dust-corrected 
weight image, we obtain a reduced value for each of 
the profile fittings. For each dust-corrected F814W im- 
age, we pick the Sersic index pair with the lowest reduced 
from the fit (top two pairs in the case of PSF-drz) and 
list it in Table [H As an illustration, Fig. [8] shows the 
GALFIT Sersic (7101,^02) = (3,4) results of the dust- 
corrected F814W image using the three-band dust map 
from PSF-Bl. The dark (light) patches in the upper 
right-hand corner of the middle (right-hand) panel re- 
sult from the noisy dust map due to low signal to noise 
in this area. Apart from this area and the lens galaxies' 
cores, most of the observed lens galaxies' light matches 
the dusted Sersic profiles in the middle panel, as shown 
in the residual map in the right-hand panel. The misfit 
near the cores could be due to intrinsic color variations 
in the lens galaxies, the dust screen assumption, PSF 
imperfections, and/or inapplicability of a single Sersic 
model at the center. Nonetheless, accurate light fitting 
near the cores of the lens galaxies is not important; it 
is for the isophotes of the Einstein ring that we need to 
have accurate dust and lenses' light corrections for the 
lens modeling. For the ring, the dust screen assumption 
in our approach is valid. 

5.6. Comparison of PSF, dust, and lens galaxy light 

models 

Following the method outlined in Section 21 we can 
use the Bayesian evidence from the source-intensity re- 
construction to compare the different PSF (B), dust (K) 
and lens galaxy light (Z) models. For each set of B, K, 
and I, we obtain the corresponding galaxy-subtracted 
F814W image (d — B • K • Z) that is analogous to the one 
shown in the right-hand panel of Fig. [51 We then make 
a 130 X 130 pixel cutout of the 0.03" galaxy-subtracted 
image and use the SPLEl+D (isotropic) lens potential 
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Fig. 7. — Crossing isophotes of the B1608-I-656 Einstein 
ring. Shown here is the dust-corrected and galaxy-subtracted 
F814W image (solid contours), with th e critical curves of the 
SPLEl-(-D (isotropic) potential model I jKoopmans et aTl 120031 ) 
overlaid (dashed curves). The inset shows a "zoomed-in" view 
of the region between images C and A; here, the dotted curves 
in the zoomed-in panel are the intensity contours of the galaxy- 
subtracted F814W image without dust correction. After dust cor- 
rection, the crossing point of the isophotal separatrix (the center 
of the figure-eight isophote) is shifted toward the critical curve, 
indicating successful dust correction. 

(|Peng et al.ll2002D . In particular, we impose the Sersic 
indices to be one of the following pairs: (nGi,«G2) = 
(1,1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4). There are more 
pairings with n — S a nd ti = 4 sin c e pre vious 
works by, for examples, iBlandford et al.l (|2001[ ) and 
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Fig. 8. — Sersic lens galaxy light profile fitting to the dust-corrected F814W image, with PSF-Bl and its corresponding three-band dust 
map, using GALFIT. The left-hand panel shows the best-fit Sersic light profiles with Sersic indices {nQi,nQ2) = (3,4). The middle panel 
shows the dust-extincted galaxy light profiles, which is the left-hand panel with the dust extinction added back in. The right-hand panel 
shows image residual (difference between the F814W drizzled image in Fig. |3] and the middle panel) with misfit near the cores of the lens 
galaxies of ^ 25 — 35%. 



TABLE 4 

Best-fitting Sersic light profiles for the lens galaxies 

Gl AND 02 FOR the SEVEN DIFFERENT DUST-CORRECTED F814W 
IMAGES BASED ON DIFFERENT PSF AND DUST MAPS 



PSF 


Dust Map 


Sersic Indices (riQi. 


,7102) Reduced Xit„3 u^-ht 


drz 


Three- band 


(3,4) 


4.48 


drz 


Three- band 


(3,3) 


4.53 


C 


Three-band 


(3,4) 


5.11 


C 


Two-band 


(3,3) 


6.13 


Bl 


Three- band 


(3,4) 


5.53 


Bl 


Two-band 


(2,2) 


7.16 


B2 


Three- band 


(2,2) 


5.95 


B2 


Two-band 


(2,2) 


8.19 



Note. — In the PSF column, "drz" = drizzled TinyTim, "C" 
= closest star, "Bl" = bright star #1, and "B2" = bright star #2. 
In the dust map column, "two-band" represents the dust map ob- 
tained from just the two ACS bands, and "three-band" represents 
the dust map obtained from the two ACS and the one NICMOS 
band. 



model in iKoopmans et aLl (|2003[ ) , which is the most up- 
to-date simply-parameterized lens potential model for 
B1608-I-656, for the source-intensity reconstruction. Due 
to the source and image pixelizations, we include the re- 
gridding error (described in Section r5.2.ip in the image 
covariance matrix. 

We select an annular region enclosing the Einstein 
ring, and use the data inside this region for the source- 
intensity reconstructions for each set of the PSF, dust, 
and lens galaxy light models. The source grid, which 
we fix to have 32 x 32 pixels, has pixel sizes that are 
~ 0.022" to cover the marked elliptical annular region 
when mapped to the image plane. This is sufficient for 
achieving reasonable reconstructions and is computation- 
ally manageable. In the inversions, we reduced the PSF 
to 15 X 15 pixels to keep the matrices such as B reason- 
ably sparse for computing speed. We try three forms of 
regularization: zeroth-order, gra dient and curvature (e.g. 
Appendix A of lSuvu et a"ni2006f ). 

Table [5] lists the suite of PSF, dust, and lens galaxy 
light models we obtained in the previous section. We la- 
bel the different models by numbers from 1 to 11 in the 
left-most column. Models 9 and 10 correspond to the 
mixing of the dust maps and lens galaxy light profiles 
derived from PSF-Bl with PSF-C and vice versa. Model 



11, which is included as a consistency check, uses PSF- 
Bl and has no dust correction applied. For each set of 
models, the source-intensity distribution for B1608+656 
is reconstructed. As an example. Fig. [9] shows the results 
of the source reconstruction with gradient regularization 
using PSF-Bl, its corresponding three-band dust map, 
and the resulting Sersic (nGi,'raG2) = (3,4) galaxy fight 
profile. The top left-hand panel shows the reconstructed 
source-intensity distribution that is approximately local- 
ized, an indication that the lens potential model is close 
to the true potential model. In the top-middle panel, 
the pixels that are far from the source but are inside the 
caustics have lower la error values than the pixels outside 
the caustics due to higher image multiplicity inside the 
caustics. The bottom right-hand panel shows significant 
image residuals (the reduced is 1-9 inside the annu- 
lus), a sign that the PSF, dust, lens galaxy fight, and/or 
the lens potential models are not optimal. In Section 
[6l we will use the pixelated potential correction scheme, 
which is more suitable for interacting galaxy lenses, to 
improve the simply-parameterized SPLEl+D (isotropic) 
model. 

The source-intensity reconstructions using other PSF 
and lens galaxy light models with three-band dust maps 
give overall similar inverted source intensities and im- 
age residuals, but the source intensities can be more or 
less localized and the magnitude and structures of the 
image residuals vary for different model sets. However, 
the source-intensity reconstructions using models with 
two-band dust maps result in source intensities that are 
not localized, and the image residuals show surpluses 
of light in the ring region and deficits of light in the 
lens galaxy region (corresponding to the color regions we 
marked for obtaining the intrinsic colors). The reason is 
that with only two bands, the resulting dust-corrected 
F814W image is highly sensitive to relative shifts be- 
tween the F606W and F814W images (due to an imper- 
fect PSF model, deconvolution, and reconvolution) and 
errors in the modeled intrinsic colors. The abrupt change 
in the modeled intrinsic colors across the boundaries of 
the color regions creates artificial surpluses or deficits of 
dust-corrected light near the boundaries. This effect is 
suppressed with the addition of the F160W image be- 
cause the F160W image suffers relatively little extinc- 
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TABLE 5 

PSF, DUST, AND LENS GALAXIES' LIGHT MODEL COMPARISON 
BASED ON BAYESIAN SOURCE INVERSION 





PSF 


Dust Map 


Sersic (nGii™G2) 1 


Fteg. Type Log Evide: 
(xlO*) 


1 


drz 


Three-band 


(3,4) 


grad 


1.49 


2 


drz 


Three-band 


(3,3) 


grad 


1.48 


3 


C 


Three-band 


(3,4) 


grad 


1.60 


4 


C 


Two-band 


(3,3) 


zeroth 


1.40 


5 


Bl 


Three-band 


(3,4) 


grad 


1.56 


6 


Bl 


Two-band 


(2,2) 


zeroth 


1.10 


7 


B2 


Three-band 


(2,2) 


grad 


1.55 


8 


B2 


Two-band 


(2,2) 


zeroth 


1.23 


9 


C 


Bl/three-band 


(3,4) 


zeroth 


1.56 


10 


Bl 


C /two- band 


(3,3) 


zeroth 


1.36 


11 


Bl 




(3,4) 


zeroth 


1.27 



Note. — For each set of the P SF, dust, and lens galaxy light 
profiles derived in Sections l5.3tf5.5l the Bayesian log evidence value 
is from the source-intensit y reconstru ction using the SPLEl+D 
(isotropic) model in Koop mans et al] (2003). The uncertainty in 
the log evidence value due to source pixelization is ~ 0.03 X 10^. 
In the PSF column, "drz" = drizzled TinyTim, "C" = closest star, 
"Bl" = bright star #1, and "B2" = bright star #2. In the dust 
map column, we list "two-band" for the dust map obtained from 
just the two ACS bands and "three-band" for the dust map ob- 
tained from the two ACS and the one NICMOS band. Unless 
otherwise indicated in the dust map column, the PSF model used 
for the dust map derivation was the same as the corresponding 
PSF model in the PSF column that was used for source recon- 
struction. For completeness, we restate the Sersic indices in Table 
[4] in the lens galaxy light profile column, which were obtained for 
the corresponding dust maps and PSFs specified in the dust map 
column. The column of "Reg. Type" refers to the preferred type of 
regularization for the source reconstruction, based on the highest 
Bayesian evidence value. It can be one of three types: zeroth-order, 
gradient, or curvature. 

tion, and the error due to misalignment in the images 
and abrupt change in the modeled intrinsic colors is re- 
duced when one has more than two bands. A few tests 
suggest that the error in the dust-corrected image due to 
the range of intrinsic colors listed in Table [3] overwhelms 
the error associated with the foreground dust screen as- 
sumption for the lens galaxy light. 

The source- intensity reconstruction in Model 11 with 
no dust correction shows significant image residuals in 
the extended ring, with overall surpluses of light sur- 
rounding images A and B and deficits surrounding im- 
ages C and D. The source intensity is also poorly re- 
constructed, being nonlocalized and noisy. This illus- 
trates the importance of dust correction for the initial 
SPLEl+D (isotropic) model. 

5.6.1. Results of Comparison 

Table [5] summarizes the results of model comparison. 
The "Reg. Type" column denotes the preferred type of 
regularization for the source recon struction based on the 
highest Bayesian evidence value (|Suvu et al.l 120061 ). It 
can be one of the three types that we use: zeroth-order, 
gradient, and curvature. The last column lists the log 
evidence values from the inversions. Assuming the differ- 
ent models to be equally probable a priori, we use these 
evidence values for model comparison. The log evidence 
values range from 1.1 x 10^ to 1.6 x lO'' with uncertainties 
of - 0.03 X 10"* due to the finite source resolution. 

The list shows that the three-band dust models have 
higher evidence values than the two-band dust models. 



This is attributed to the two-band dust models showing 
image residuals from the aforementioned artificial sur- 
pluses and deficits of light in the dust-corrected image. 
The inclusion of the NICMOS F160W image to the ACS 
images (F606W and F814W) for the dust correction is, 
therefore, crucial due to (1) the proximity in the wave- 
lengths of the ACS images and (2) the reduction in the 
error associated with image misalignments and simplistic 
intrinsic color models. 

The three-band dust models also have higher evidence 
values than the no-dust model. This further validates 
the three-band dust correction, as already indicated by 
Fig. [T] The evidence value of the no-dust model is in 
midst of the values for the two-band dust models, sug- 
gesting that the systematic effects in the two-band dust 
maps are comparable to the corrections that the dust 
maps are meant to achieve, thus leading to little im- 
provement in the lens modeling. 

The difference between the evidence values in Mod- 
els 1 and 2 (where the models only differ in the Sersic 
light profiles) is, in general, smaller than the difference 
between one of these two models and another PSF/dust 
model. Therefore, the source reconstruction (part of lens 
modeling) seems to be less sensitive to the galaxy light 
profiles than the PSF/dust models. This is in agreement 
with our finding that the dust-corrected image depends 
more on the PSF and the intrinsic color models than on 
the form of the lens galaxy light and the dust associated 
with the lens galaxies. Models 1 and 2 with PSF-drz have 
log evidence values on the low side of the collection of 
models with three-band dust maps, which was expected 
with PSF-drz not having a single brightness central peak 
due to misalignments in the drizzling process. The other 
models with three-band dust maps (Models 3, 5, 7 and 9) 
have effectively the same evidence values within the un- 
certainties. The models with two-band dust maps (Mod- 
els 4, 6, 8, and 10) lead to a range of evidence values with 
the PSF-C dust map being preferred to the PSF-Bl and 
PSF-B2 dust maps. The two-band dust maps suggest 
that the shape of the primary maximum in the PSF is 
more important in the modeling than the inclusion of sec- 
ondary maxima since PSF-C, which we expect to have 
a more accurate shape for the primary PSF maximum 
than PSF-Bl and PSF-B2, does not have the secondary 
maxima whereas PSF-Bl and PSF-B2 do. The asymme- 
try in the PSF due to the star not being centered on a 
single pixel may also explain the less-preferred PSF-Bl 
and PSF-B2. The distinction between the various stellar 
PSFs vanishes with the three-band dust maps, possibly 
due to the higher amount of noise in the three-band dust 
map with the inclusion of the lower S/N NICMOS image. 
In this case, the effects of the PSF variations across the 
field are suppressed. 

All models preferred either the zeroth-order or gradi- 
ent form of regularization, but never the curvature form; 
however, we mention that the difference in the log evi- 
dence values between the different regularization schemes 
3 X 10^) are on the order of the uncertainties due 
to source pixelization, and the resulting reconstructions 
for different types of regularizations are almost identical. 
This is because differences in evidence values between 
models are currently dominated by changes in goodness 
of fit rather than subtle differences between the prior 
forms. Only when the image residual is reduced will 
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Fig. 9. — Source-intensity reconstruction of B1608+656 (assuming model #5 in Table [St. Top panels from left to right: the reconstructed 
source-intensity distribution with the caustic curves of the SPLEl-l-D (isotropic) model overlaid, the la error for the source-intensity values, 
the S/N of the reconstruction (i.e., the ratio of the top left-hand to the top- middle panel). Bottom panels from left to right: the observed 
F814W galaxy-subtracted image, the reconstructed image using the reconstructed source in the top left-hand panel, and the normalized 
image residual (i.e., the map of the difference between the bottom left-hand and the bottom middle panels, in units of the estimated pixel 
uncertainty from the data image covariance matrix). 



the prior (regularization) begin to play a greater role in 
avoiding the reconstruction to fit to noise in the data by 
keeping the source model simple. 

This section has illustrated a method of creating sen- 
sible PSF, dust, and lens galaxy light models for the 
gravitational lens B1608-I-656. We have obtained a rep- 
resentative sample of models, and have compared these 
models quantitatively. This collection of PSF, dust, and 
lens galaxy light models leads to image residuals that 
cannot be beaten down further unless we improve the 
SPLEl-l-D (is otropic) simply-paramet erized lens poten- 
tial model by iKoopmans et al.l (|2003D to take into ac- 
count the two interacting galaxy lenses. The pixelated 
potential reconstruction of B1608-I-656 is the subject of 
Section [51 

6. PIXELATED LENS POTENTIAL OF B1608-f656 

We reconstruct the lens potential for each set of the 
PSF, dust, and lens galaxies' light in Models 2-11 in 
Table [HI We describe in detail the potential reconstruc- 
tion using Model 5, which is one of the four models that, 
within the uncertainties, have the highest Bayesian evi- 
dence value before the potential correction. At the end 
of the section, we discuss the differences in the potential 
reconstruction between the various PSF, dust, and lens 
galaxies' light models. 

To reconstruct the lens potential of B16084-656, we 
use a 130 x 130 pixel cutout of the drizzled ACS/F814W 
image with the pixel size 0.03" shown in Fig. \3\ The 
galaxy-subtracted F814W image {— d — B ■ K ■ I) is a 
130 X 130 subimage of the right-hand panel in Fig. [8] 



with 200 X 200 pixels. 

We follow the potential reconstruction method that 
was shown to succeed in Section [3l For the initial lens 
potei itial model, we us e the S PLEl-l-D (isotropic) model 
from iKoopmans et al.l ()2003( ) . We perform nine itera- 
tions (labeled as 0-8) of pixelated potential corrections 
on B1608-f656. For each iteration, we first reconstruct 
the source intensity on a 32 x 32 grid with pixel sizes 
of 0.022". The source region is chosen so that it maps 
to a completely joined annulus on the image plane (so 
that we can determine the relative potential difference 
between images). As in Section \5M the PSF is reduced 
to a 15 X 15 matrix to keep the inversion matrices sparse 
(and computation time low). Furthermore, we use only 
the curvature type of regularization for the source recon- 
struction to reduce computation time and to have regu- 
larized source- intensity gradients for the potential correc- 
tions. The source inversions are over-regularized in the 
early iterations to ensure a smooth resulting source for 
taking gradients. The source over-regularization factors 
start at 1000 and are gradually decreased to 1 at iter- 
ation=8. With the resulting source-intensity gradients 
and intensity deficits from the source reconstruction, we 
perform the potential correction on a grid of 30 x 30 pix- 
els. We use the curvature form of regularization for each 
potential correction iteration. To keep the corrections 
linear, the potential corrections are also over-regularized 
with the regularization constant (/i) set at 10 times the 
value where iiEg^ peaks, as in Section [31 The corrected 
potential has the midpoints in the left, bottom, and right 
parts of the annular reconstruction region fixed to the 
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initial potential model. 

The top row of Fig. [TOl shows the results of iteration=0 
of source and potential reconstruction. The left-hand 
panel shows the reconstructed source that has been over- 
regularized by a factor of 1000. The caustics are those 
of the initial SPLEl-l-D (isotropic) model. The source is 
localized and compact, a sign that the initial SPLEl-l-D 
(isotropic) potential we started from is close to the true 
model. The middle-left panel shows significant image 
residuals that are to be corrected, especially near the 
cores of the images due to the over-regularization of the 
source-intensity distribution. The annular region marks 
the region of data that we use for the evidence computa- 
tion in the final iteration of source reconstruction. Using 
the gradient from the reconstructed source and the in- 
tensity deficit, the middle-right panel shows the potential 
reconstruction of iteration=0 and the right-hand panel 
shows the fraction of the accumulated potential correc- 
tions relative to the initial model. 

The middle row of Fig. [10] shows the result of 
iteration=2 of source and potential reconstruction. 
Compared to iteration=0 that has the same over- 
regularization factors, the source reconstruction is 
slightly smoother, the image residual has decreased, and 
the potential correction is not as large. 

In the iterations from 3 to 8, the potential corrections 
are small; therefore, the source reconstruction and image 
residual change only gradually during these iterations. 
The bottom row of Fig. [10] shows the results of itera- 
tion=8 (the last iteration). The reconstructed source in 
the left-hand panel has more background noise than iter- 
ation=2 because the source is now optimally regularized. 
The source after the potential correction is more localized 
than that before the potential correction in Fig.[9l which 
is a good indication that the reconstructed potential is 
closer to the true potential (up to the mass-sheet degen- 
eracy). The normalized image residual in the middle- left 
panel shows an overall decrease in the image residual 
compared with that in Fig. [9l There remains intensity 
deficit near the image locations since the intensities of 
point-like images do not generally match due to the time 
delays and variability. This misfit can also be due to 
the undersampling of the PSF. There is also remaining 
image residual near image C that is likely due to imper- 
fections in the dust correction. Nonetheless, the reduced 
^ inside the annulus is 1.1 (keeping in mind the unsealed 
nature of our image pixel uncertainties). The right-hand 
panel in the bottom row of Fig. [10] shows that the final 
accumulated potential correction relative to the initial 
model is only ^ 2%. The structure of the accumulated 
potential correction may seem to resemble the simulation 
in Fig.[l] however, this does not mean that the potential 
correction in B1608-I-656 corresponds to a mass clump 
as in the simulation. We point out that the maps of the 
potential corrections that generally look similar (due to 
the fixing of the three points in the annulus) may lead to 
very different convergence maps. 

The potential reconstruction described above is for 
Model 5. After repeating the procedure for the other 
models, we find that the image residual and source re- 
construction in the final iteration for the other three- 
band dust models are similar in feature to Model 5. In 
contrast, the two-band dust maps' source reconstruction 
continue to show nonlocalized source intensities with spu- 



rious light pixels outside of the main component. Fur- 
thermore, parts of the artificial surpluses or deficits of 
the dust-corrected light near the color boundaries remain 
after the potential correction. For Model 11 with no 
dust, the potential corrections lead to a localized source 
with image residuals that show misfit only near the image 
cores and locations of the dust lane. 

These results of the potential reconstructions can be 
quantified using the Bayesian evidence values from the 
source reconstruction of the final corrected potential. Ta- 
ble [6] lists the evidence values for Models 2 to 11. The 
uncertainties in the evidence values are due to the source 
pixelization and the possible range of over-regularization 
for the source-intensity reconstruction and lens poten- 
tial correction. We explored over-regularization factors 
in the range between 1 and 1000 for the source intensity, 
and various factors within 30 of the regularization con- 
stant /i that corresponds to the peak of iiEsii, for the po- 
tential correction. The table shows that the three-band 
dust maps are consistently ranked higher than the two- 
band dust maps, indicating the importance of including 
the NICMOS image for the dust correction. All three- 
band dust maps give the same evidence values within 
the uncertainties, indicating that the various PSF and 
three-band dust models are all acceptable. Further- 
more, the resulting Fermat potential differences between 
the images for these models agree within the uncertain- 
ties. Model 11 with no dust leads to the same evidence 
value as the values for three-band dust models. The 
predicted Fermat potential differences between the im- 
ages for Model 11 are also in similar ranges as those of 
the three-band dust models. This shows that the global 
structure of the lens potential remains relatively intact 
after the dust correction to give similar predicted Fer- 
mat potential values, even though local pixelated poten- 
tial corrections are fiexible enough to mimic the effects of 
dust extinction. It is encouraging that the dust extinc- 
tion in B1608-I-656 does not alter the surface brightness 
in a systematic way as to change the global structure of 
the lens potential. This robustness in the global struc- 
ture of the lens potential is important for inferring the 
value of the Hubble constant. 

In summary, for the top PSF, dust, and lens galaxies' 
light models, the pixelated potential correction scheme 
was successfully applied to B1608+656 leading to poten- 
tial corrections of ~ 2%. This is only a small amount of 
co rrection, indicating tha t the smooth potential model 
in iKoopmans et all (|2003f l is remarkably good. The re- 
sulting source is also well localized. 

This completes the dissection of the gravitational lens 
B1608-I-656. The image residual is not fully eliminated 
possibly due to imperfect PSF, dust, lens galaxies' light 
modeling, variability in the point source intensities, finite 
source resolution, and/or undersampled PSF. In Paper 
II, we use the models in Table [6] to derive Hq and to 
estimate its uncertainty associated with the modeling. 

7. MASS AND LIGHT IN THE B1608+656 LENS SYSTEM 

The clean dissection of the lens system in the previous 
sections allows us to study the mass and light in Gl and 
G2. 

Since the amount of potential correction is small, we 
can safely neglect the implied corrections when estimat- 
ing the mass associated with the lens galaxies. Inte- 
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Fig. 10. — Results of the iterative pixelated potential reconstruction of B1608+656. Top row, which shows the results of iteration=0: 
the left-hand panel shows the over-regularized curvature source reconstruction, the middle-left panel shows the normalized image residual 
(in units of the estimated pixel uncertainty from the data image covariance matrix) based on the inverted source, the middle-right panel 
shows the potential corrections on an annulus using the curvature form of regularization, and the right-hand panel shows the accumulated 
potential corrections relative to the initial potential model. The source is localized, an indication that we are close to the initial model, but 
not at the true potential model because significant image residuals are present. Middle row, which shows the results of iteration=2: the 
panels are arranged in the same way as in the top row. Compared to iteration=0, the image residuals and the potential corrections are both 
smaller. Bottom row, which shows the results of iteration=8: the panels are arranged in the same way as in the top row. The resulting 
source of the corrected potential is more localized than that of the uncorrected potential in Fig. (9] and the image residual corresponds to 
a reduced of 1.1. The accumulated potential correction is only 2%. 



grating the SPLEl+D (isotropic) surface mass density 
of each of the lens galaxies within their respective Ein- 
stein radii, the mass of Gl enclosed within te-gi — 0.81" 
is Mgi = 1.9 X 10"/i"^ Mq, and the mass of G2 enclosed 
within rE;G2 = 0.28" is Mg2 = 2.8 x lO^^h'^ Mq . Our 
dust correction enables us to recover the intrinsic lumi- 
nosity of the lens galaxies. We use the fitted Sersic light 
profiles to estimate the luminosity of Gl and G2. Inte- 
grating the flux of Gl and G2 within rE;Gi and rE;G2, 
respectively, the total mass to rest-frame B-band light 
ratio of Gl is (M/Lb)gi = (2.0 ± 0.2)hMQ L^^^ and of 

G2 is (M/Lb)g2 = (1.5±0.2)/iMqLb_^q. The total mass 
and M/L of Gl are consist ent with those from earlier 
works on B1608+656 (e.g., iFassnacht eranil996D after 
taking into account the difference in the Einstein radius 
(due to the different number of components in the lens 
model) and the lowered M/L as a result of the dust cor- 



rection. Th e M/L ra tio of Gl is low compared to the lens 
galaxies in iTreu fc' Koopmans (2004), which have M/L 
in the range ^S-SMq/Lb,©. This is consistent with the 
spectrum of Gl showing signatures of both young and 
poststarburst populations, since these types of galaxies 
can have lower M/L ratios by a factor of ^ 10 com- 
pared to other E/SO galaxies at similar redshifts (e.g., 
Ivan Dokkum fc Stanford! l2003l ). Therefore, even though 
B1608-I-656 co nsists of two interacting galaxy lenses that 
he in a group (jFassnacht et al.l [20061 ) . the M/L ratio of 
Gl is consistent with those in noninteracting lens sys- 
tems. 

8. CONCLUSIONS 

In this paper, we have described and tested an iterative 
and perturbative lens potential reconstruction scheme 
whose accuracy in the recovered lens potential is in prin- 
ciple solely limited by the noise in the data, provided 
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TABLE 6 

Ranked model comparison after potential reconstruction 



ivioaei 




dust 


log evidence 
(XiU ) 


5 


Bl 


1 nree- band 


1.77 ± U.U5 


9 


C 


B 1/ three-band 


1.76 ±0.04 


3 


C 


Three- band 


1.76 ±0.05 


11 


Bl 




1.76 ±0.05 


2 


drz 


Three-band 


1.75 ±0.05 


7 


B2 


Three-band 


1.75 ±0.05 


10 


Bl 


C / two-band 


1.61 ±0.05 


4 


C 


Two-band 


1.58 ±0.05 


6 


Bl 


Two-band 


1.41 ±0.05 


8 


B2 


Two-band 


1.40 ±0.05 



Note. — In the PSF column, "drz" = drizzled TinyTim, "C" 
= closest star, "Bl" = bright star #1, and "B2" = bright star #2. 
In the dust map column, "two-band" represents the dust map ob- 
tained from just the two ACS bands, and "three-band" represents 
the dust map obtained from the two ACS and the one NICMOS 
band. The uncertainty in the log evidence from the source-intensity 
reconstruction is due to the source pixelization and the possible 
range of over-regularization for the source-intensity reconstruction 
and lens potential correction. Within the uncertainties. Models 5, 
9, 3, 11, 2 and 7 have the highest evidence values. Note that the 
three-band dust maps are ranked higher than the two-band dust 
maps. 

we have extended sources giving well connected ring-like 
images. The method is based on a Bayesian analysis, 
which provides a quantitative approach for comparing 
different models of the various constituents of a lens sys- 
tem: PSF, dust, lens galaxy hght, and lens potential. We 
applied this method to the gravitational lens B1608+656 
with deep HST ACS observations. We presented an im- 
age processing technique for obtaining a suite of PSF, 
dust, and lens galaxies' light models, and compared 
these models quantitatively. For each model, we re- 
constructed the lens potential on a grid of pixels, using 
the simply-pararneterize d SPLEl+D (isotropic) model in 
iKoopmans et al.l (|2003[ ) as our initial model. The re- 
constructions for the models with three-band dust maps 
were deemed successful in that they led to an accept- 
able level of image residual and a well-localized inferred 
source-intensity distribution. 
From our analysis, we draw the following conclusions. 

1. The potential reconstruction method, which simul- 
taneously determines the extended source intensity 
and the lens potential distributions on grids of pix- 
els, can correct for potential perturbations that are 
<5%. 

2. The mass-sheet degeneracy is broken in the po- 
tential corrections by choosing forms of regulariza- 
tion that suppress large deviations from the initial 
(mass-constrained) model unless the data require 
them. 

3. The NICMOS F160W image is needed to comple- 
ment the ACS F606W and F814W images for dust 
correction in order to avoid systematic errors. 

4. The level of potential correction required in 
B1608+656 was found to be ~ 2%, validating 
the use of the simpl y-parameterized model of 
IKoopmans et al.l (|2003[ ). 



5. The effect of dust extinction does not alter the 
global structure of the lens potential, and hence 
the predicted Fermat potential differences between 
the images. 

6. The mass and M/Lb of G1 inside te — 0.81" are 
1.9xlO"/i"iMo and (2.0±0.2)/i Mq Lg^^, respec- 
tively. These values are consistent with the spec- 
tral type of this galaxy, and previous less accurate 
estimates of its M/L ratio. 

Although the pixelated potential reconstruction 
method can be applied to any lens system with an ex- 
tended source-intensity distribution, it is particularly 
useful for measuring Hq in time-delay lenses. B1608+656 
is the only four-image gravitational lens system that have 
all three independent relat ive time delays measured wit h 
errors of a few percent (|Fassnacht et al.l I1999I . I2002f) . 
However, current and future imaging surveys (such as the 
Canada-France-Hawaii Telescope (CFHT) Legacy Sur- 
vey, the Panoramic Survey Telescope & Rapid Response 
System, the Large Synoptic Survey Telescope, and the 
Joint Dark Energy Mission) either are or soon will be 
producing many more lenses: we can anticipate building 
up a sample of lens systems that can be fruitfully studied 
using the methods we have developed. 
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APPENDIX 

THE MATRIX OPERATOR FOR PIXELATED POTENTIAL CORRECTION 

A comparison of the potential correction Equation ([3]) with its matrix form in Equation ^ shows that the matrix 
operator t needs to include the PSF blurring, the reconstructed source-intensity gradient, and the gradient operator 
that acts on the potential perturbations Sip. We will consider each of these in the reverse order. 

Before discussing the gradient operator, we need to define the domain over which the gradient operates. Recall that 
the potential corrections are obtained on an annular region that contains the Einstein ring of the lensed source. This 
region was obtained by tracing all the potential pixels back to the source plane (from the lens equation) and seeing 
which ones land on the finite source region of reconstruction. Only these potential pixels that trace back to the finite 
source region will have values of the source-intensity gradient for potential correction via Equation ([3]). These pixels 
tend to mark an annular region. Therefore, we need to find the gradient operator on this annular region for Sip. 

To construct the gradient operator, we use finite differencing to obtain numerical derivatives. For simplicity, first 
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Fig. Al. — Typical annular region for potential c orre ctions and the form of dfij/dxi for each pixel. The blank pixels use the 
i = 2, . . . , M — 1 expressio n fo r dfij /dxi in Equation IIAll l. Th e pix els with "e" are edge pixels that use the i = 1 or i = M expressions 
for dfij/dxi in Equation I IAll l. The shaded pixels use Equation IIA2I I for dfij /dxi. The hashed pixel is an example of an "exposed" pixel 
with no adjacent pixel in the xi direction. 

consider a, M x N rectangular grid with xi and X2 as axes and (i, j) as pixel indices (typically M N ^ 30). In this 
case, the partial derivatives of a function fij defined on the grid are: 



9kj 
dxi 



^(-3/i,,+4/2,,-/3.,) if* = l 

2ijr(/m,, if» = 2,...,Af-l 

2S^ifM-2.j - 4/m-i,j + 3/Af j) if i Af 



% = <! 2K^Mu+i kj-i) if J - 2, . . . , iV - 1 , (Al) 



dx2 



2A 



hrMN-2 - ^kN-l + 3/^,iv)if J = N 



where Axi and Aa;2 are, respectively, the pixel sizes in the xi and X2 directions. For the annular region of potential 
corrections, we only need to elaborate slightly on Equation (jAl[) . Fig. lAll shows a typical annular region and the types 
of pixels when numerically differentiating in the Xi direction. The edge pixels of the annulus, which are denoted by 
"e" in the figure for the xi direction, are treated as though they are like the edge pixels of the rectangular grid (so 
that the i = 1, i = M, j = 1, or j = N expressions are used) when the edge pixels are adjacent to at least two other 
pixels in the annulus in the direction of which the numerical derivative is taken. If an edge pixel of the annulus is only 
adjacent to one other pixel in the direction of which the numerical derivative is taken, such as the shaded pixels in the 
figure for the xi direction, then we construct the gradient by taking the difference between the two and dividing by 
the pixel size. For example, if fij is at the edge, and fi+i.j is also in the annulus (which will have to be an edge pixel 
if fi+2,j is not in the annulus), then the numerical derivatives in the xi direction for both /j j and fi+ij are 

^fij _ fi+i-j ~ fi,j (A2) 

dxi Axi ' ^ ' 

A similar equation applies for the X2 direction. If an edge pixel in the annulus is "exposed" in the sense that in one 
of the directions xi or X2, it has no adjacent pixels in the annulus, then this pixel is removed from the annular region 
of reconstruction as no numerical derivative can be formed. An example of an "exposed" pixel in the xi direction is 
the hashed pixel in the figure. Following the above prescription, we can obtain the values {dfij/dxi, dfij/dx2) of all 
the pixels in the annulus in terms of values of the function in the annulus fki- Factoring out the fki values, we 
obtain the gradient operator defined as two matrices: Di for d/dxi and D2 for d/dx2- 

To conform to the data grid (since the image residual and image covariance matrix is defined on the data grid), 
we use bilinear interpolation. We overlay the data grid on the coarser grid, and for every data pixel that lies inside 
the annular region on the coarse grid, we bilinearly interpolate to get, effectively, gradient operators on the data grid. 
This gives us an x N-p matrix G where each row (corresponding to a data pixel that lies within the annulus) 
has four nonzero values that correspond to the coefhcients of bilinearly interpolating among the four coarse potential 
pixels surrounding this data pixel. Associated with each data pixel are the source-intensity gradient values (dl /df3i 
and dl/d(32) that were obtained by mapping the data pixel back to the source plane using the lens equation, and 
interpolating on the reconstructed source-intensity gradient on the source grid. We define matrices Gi and G2 as the 



DISSECTING THE GRAVITATIONAL LENS B1608+656. I. 



25 



matrix G multiplied by the source-intensity gradient components dl /d[3i and 81/0^2, respectively. By definition, Gi 
and G2 are also Ad x A^p matrices. 

Lastly, we repres e nt th e PSF as a blurring matrix (operator) B that is of dimensions A'd x Ad (see e.g.. Section UJ 
iTreu &: Koopmani (|2004f )). Note that this matrix B is different from the matrix in Section [2.2.11 that is the Hessian 
of the Ed . 

Combining all the pieces together, the matrix operator t is 



which is of dimensions A'd x A'p. 

For the gravitational lens system B1608-I-656, we also need to include the effects of dust extinction, which we express 
as a diagonal matrix K. Tracing back along the light rays, we encounter the dust immediately after the PSF blurring 
(for the light from the lensed source). Therefore, we include it in Equation (jA3p after B to get the following expression 
for the matrix operator t that includes dust: 



t = B Gi Dl + B G2 D2 



(A3) 



t = B K Gi Dl + B K G2 D2. 



(A4) 



